SQL Server What are the downsides of making a numeric column a string in a database?

kmpatx3s  于 2023-08-02  发布在  其他
关注(0)|答案(4)|浏览(123)

I have members on my team who are tossing around the idea to make every column in the database a string including numeric columns.

I know that sorting becomes an issue and data integrity becomes an issue and performance becomes and issue.

What are the other downsides of making a numeric column a string just for edification and so I can share with the team members?

qjp7pelc

qjp7pelc1#

The major issue is that users can put broken data into the columns -- data that is not numeric. That is simply not possible with the correct type. Although you could add a check constraint for every numeric column, that seems like a lot of work.

The scenario is: You have a query that works and has worked for a long time. All of a sudden, someone puts a non-numeric value into the column. The query breaks. And because the query was (probably) using implicit conversion, it is really hard to tell where the problem is.

Let me just say: I am speaking from experience here.

Other problems are:

  • Comparisons don't work as expected: '0' <> '0.0' .
  • Comparisons don't work as expected: '9' > '100' .
  • Comparisons don't work as expected: '.1' < '0.01' .
  • Sorting doesn't work as expected.
  • The code is filled with (unnecessary and typically implicit) conversions.
  • Some databases, such as SQL Server, overload operators so '1' + '1' <> '2' .
  • Some databases overload operators, so current_timestamp + 1 is valid but current_timestamp + '1' is invalid.
  • A comparison in a query can affect index usage. So, strcol = 1 ends up converting strcol to a number, which typically precludes the use of an index. On the other hand, intcol = '1' ends up converting the constant to a number, which still allows the index to be used. I do not recommend mixing types in comparisons, though.

Space is a wash, because in many cases the string representation might be smaller than the number representation. It depends in that case. There is a slight hit on indexing, because fixed length keys are usually more efficient.

If you mix types, things get worse -- because that affects the optimizer.

Some things that are composed of numbers are not necessarily numeric. You can usually tell the difference easily: does it make sense to perform arithmetic operations on the value? Or another indicator: do leading zeros make sense?

xbp102n0

xbp102n02#

  • it will take more space
  • indexes will also take more space and be less efficient
  • ordering will not work correctly (e.g. "10" < "2")
  • any numeric operations will not work correctly (e.g. 10% more than x)

having said all this, fields like SSN, phone number, etc. that appear numeric but are not really numbers should be strings.

iqjalb3h

iqjalb3h3#

In general, if the numeric column is an ID and never used for calculations, it is probably OK. If the numbers are "measures", like amount or quantity, I would not recommend it as you most likely would want to do calculations at some point (like SUM, AVG, etc)

kgqe7b3p

kgqe7b3p4#

I got this type of issue to an externally designed db faced lots of challenges:

  1. Conversion of date, numeric columns during querying
  2. Indexing took more space and has slower performance

相关问题