Dates and Times

Data types and storage types

Numeric values are stored as one of five types that differ in range and accuracy: byte, int, long, float or double. Strings are stored as str#, where # indicates the maximum length of the string. Storage types mainly affect how much memory is needed. Usually we do not need to worry about storage types; Stata takes care of that and finds the most efficient way to store the data.

Dates and times usually come in human-readable string forms, such as "Ma16:15 pm", "2017.03.22 16:15" etc. But Stata internally stores dates and times as integers and reads them as numeric values. In fact, Stata understands a date and time variable as the difference from a base. The base (the numeric value 0) of a datetime variable is 01jan1960 00:00:00.000 (the first millisecond of 01jan1960); therefore "25jan2016 08:30:25" will be 1769329825000 (milliseconds) for Stata.

Display format allows us to specify the output of the date/time variable.

format timevar %td sets timevar to be displayed as a date variable. Here timevar is already a numeric date and time variable. format timevar %tcHH:MM:SS further sets timevar to be displayed as hour (00-23):minute (00-59):second (00-60).

format var %fmt changes the display format but not the contents of the variable. % starts the formatting, which can be a number, date, string, business calendar etc. There are multiple ways to display a year, month, week, day, hour, minute, second etc. For instance, there are three additional codes to display the hour: Hh (00-12), hH (0-23), or hh (0-12).

Check the Stata help pages for the full list and the default display formats.

*Adapted from tables in Working with dates and times and datetime – Date and time values and variables.

More on the syntax of display formats for dates and times from the Stata data management manual

Conversion

In the last column of the table above we have introduced the functions function(string, mask) that transform strings to numeric date and time variables. In those functions, mask specifies the order of the components appearing in the string.

Source: datetime translation – String to numeric date translation functions

Here are some examples of the strings and their corresponding masks:

Suppose we have the following dataset and the string variable anntims:

We want to generate a new time variable from it:

gen double anntims2 = clock(anntims,"hms")

We also want to create a new variable with both date and time from the variables anntims (string) and anndats (numeric).

gen double anndatims = dhms(anndats,hh(anntims2),mm(anntims2),ss(anntims2))

When working with two-digit years, in addition to specifying the 19Y or 20Y masks, we can specify the topyear option in the date and time functions: function(string, mask, topyear). The full year returned will be the latest year that does not exceed the specified topyear. This is useful when we want to control how the first two digits (the century) of the year are filled in for a variable.

For instance, suppose we have a date variable date (string). We want to obtain full years in the range 1949-2048 and, based on our knowledge of the data, would like to interpret two-digit years equal to or larger than 49 as years 19** and all others as years 20**:

I stopped using it when I discovered that the calculation would sometimes produce odd results, especially when trying to find newborns where age is less than 1. Working with dates is always tricky and you should always check your results. With some recent data, I was using the YRDIF method and ran into some special cases while calculating ages on some claims data for children across three different years. The precision of MemberDateOfBirth (a stored SAS date variable) was causing the age15 variable to resolve to 18 with no decimal places for children born 31Dec1996 using 31Dec2015 as the reference year.

The variable "age15b" referenced below exists because, when I tested the YRDIF method for age15 below without the INT function, the age resolved to 19.000. This only occurred for the children born 31Dec1996 using 31Dec2015 as a reference date. The same issue didn't happen to any other Dec 31st children for any of the other years in my data.

Age15 = INT(yrdif(MemberDateOfBirth,'31Dec2015'd,'actual'));

** fix special case where only for reference year 2015, kids born on Dec 31 are getting odd ages of 19.000 for age15 and 18 for age15b;
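One way to sidestep fractional-year precision when computing ages is to count completed birthdays directly from the calendar fields instead of truncating a year fraction. The following is a minimal sketch in Python rather than SAS, offered only as an illustration of the logic. The function name age_in_completed_years is hypothetical and is not part of the original SAS code, and the dates are the ones from the special case described above.

```python
from datetime import date


def age_in_completed_years(birth: date, reference: date) -> int:
    """Return the number of whole years completed on `reference`."""
    if reference < birth:
        raise ValueError("reference date precedes birth date")
    years = reference.year - birth.year
    # Subtract one year if the birthday has not yet occurred
    # in the reference year (compare month and day as a pair).
    if (reference.month, reference.day) < (birth.month, birth.day):
        years -= 1
    return years


ref = date(2015, 12, 31)
print(age_in_completed_years(date(1996, 12, 31), ref))  # 19 (birthday falls on the reference date)
print(age_in_completed_years(date(1997, 1, 1), ref))    # 18
print(age_in_completed_years(date(2015, 6, 1), ref))    # 0 (newborn, age less than 1)
```

Because the comparison uses integers only, a child born 31Dec1996 is counted as 19 on 31Dec2015 with no rounding involved. Whatever method is used, results for boundary dates such as these are worth checking by hand.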