Converting varchar/char dates to date type data.

By David J. Lake
2006-05-13

Normally people store dates as varchar/char types because they want to store their dates in a different format than yyyy-mm-dd as they are not used to displaying their dates in that way. They want to display October 23rd 2006 for instance or 10-23-2006.

There are two problems with that approach. The first is, that once you store the dates in that format, you lose all Date and Time functions. Without those, it is more difficult to find out which set of dates fall between now and April 24th for instance, or even for display purposes, difficult to order them correctly.

The second is that it is unnecessary. You can save your records in yyyy-mm-dd format that allows you to store them as a date type, and then you can use DATE_FORMAT to display your dates in a variety of ways.

Got you convinced now that you have read the documentation? Great. Now I can help you convert those nasty varchar/char dates over to a date type column with the records in the correct order.

If you are ready to jump in with both feet and worry about learning to swim after the fact, then see the entire code. If you want to test the waters first lets start off a little slower.

First lets construct a table and add some data:

create table ourdates( olddate varchar(10), newdate date );

insert into ourdates (olddate)
values
('23/06/02'), ('14/11/02'), ('10/08/02'), ('1/4/02');

Now even looking at those first two dates, it is at least obvious that they are of the format: day, month, year, the third could be month, day year and the last, well it could be just about anything.

Update:If you are using mysql 4.1.1 or newer you can use str_to_date. You specify your date string or column holding the date and then use the values from date_format to reformat your string to a proper date.

Using str_to_date

select str_to_date('00/00/0000', '%m/%d/%Y');
'0000-00-00'
select str_to_date('04/31/2004', '%m/%d/%Y');
'2004-04-31'
or to change an entire column:
select str_to_date(datecolumnname, '%m/%d/%Y');

Working with mysql 4.0 and earlier.

If you are using an older version of mysql you will need to use the code below.

So how do we work with those numbers to get them into our preferred yyyy-mm-dd format? Well we could use SUBSTRING() or the shorter variation SUBSTR() to do the work for us.

To grab the day we could use:

select substring('14/11/02' FROM 1 FOR 2);
'14'

Similarily we could use:

select substring('10/08/02' FROM 4 FOR 2);
'08'

Now those look like a reasonable solution and in fact there would be little wrong with using them, IF and that is a big if, we knew that our data did not contain any of the dates like the last one we input: '1/4/02'. The problem with the substring function now is, we don't know exactly how many characters we need to count. Well we do when we look at this one example, but when we are going through an entire table, how do we count correctly the number of characters for '14/11/02' while having a different character count for '1/4/02'?

So what do we do now that we have found this problem? Fortunately for us, mysql has another tool for our arsenal, substring_index.

You use substring_index to separate your data on some known separator, a dash for instance, or in the case of a URL usually a period like this:

SELECT SUBSTRING_INDEX('www.mysql.com', '.', 2);
'www.mysql'

SELECT SUBSTRING_INDEX('www.mysql.com', '.', -2);
'mysql.com'

With substring_index we are selecting the whole part between the separators. SO above we are searching on the '.' as the separator and the 2 in the first example is counting to the second occurance of the separator from the beginning of the string and returns everything before that seprator. Similarly the -2 has us count to the second occurance of the separator, but this time starting from the end of the string. It returns everything after that seprator to the end of the string.

What that will do is allow us to count our '/' separators and work with the data before or after a specific occurance of it. We then don't care if we have two digits, one digit or even one and two digits in our dates. The substring_index will allow us to break apart the data and work with it to reformat back into the format accepted by the date type.

Now, armed with that knowledge it is time to attack those pesky dates and turn them into something more useable.

To start let's assume that you have your dates stored as dd/mm/yy (if you have a different separator than '/' then merely change that in the code below.

select
date_add(
concat('20'
, substring_index(olddate,'/',-1)
, '-'
, substring_index(substring_index(olddate,'/',2),'/',-1)
, '-'
, substring_index(olddate,'/',1)
), interval 0 day) as mynewdate
from ourdates

Wow that was easy right? It does look like a jumbled mess though, lets step through it a bit at a time so we can figure out exactly how we are changing those dates around.

We are looking at a string in the form of dd/mm/yy. We would like to turn that into yyyy-mm-dd, we could use the / separator of course but I think more commonly dates are entered using the dash.

Let us use '23/06/02' for our example.

This part:
substring_index(olddate,'/',-1) says we are looking for the string '/' and we want to find the first occurance of that from the end of the string (the -1 says look from the end).

Now that gives us the year. The '20' at the beginning of the concat is prepended to that year bit. So now, we end up with 2002.

Now the second substring:
substring_index(substring_index(olddate,'/',2),'/',-1)
has to be broken up into two parts to understand it. The inner occurance says give me everything up until the second occurance of the '/'. That's this part: substring_index(olddate,'/',2). That would give us 23/06, those pieces being the day and the month of course.

Now want to split that piece up again to grab the month, we still want to grab that piece from the right same as above so again we are using -1 in that part. Kind of tricky but hey the database does what you tell it to do right? So now we've extracted the month, 06. That will get appended on to our 2002 and with the next line, we have added a '-' in there as well. So now we are up to 2002-06'

The last bit:
substring_index(olddate,'/',1)
tells us to grab everything before the first occurance of '/'. So that is our 23. Again we have the '-' in there so now we have: 2002-06-23.

Lastly we use the interval 0 day. This just basically tells the query that we are working with a date type here. That is in case our string was actually '23/6/2002'), where we are missing a digit. We don't actually want to add anthing to our string when we are done, just let the database know to correct for that.

Now you might have to read this over five or six times. I know I did when I first saw it. But there are so many times when I see people have messed up a date field, that it is nice to know how to piece it together again.