====================================================================
A (second) proposal for implementing some date/time types in NumPy
====================================================================
:Author: Francesc Alted i Abad
:Contact: faltet@pytables.com
:Author: Ivan Vilata i Balaguer
:Contact: ivan@selidor.net
:Date: 2008-07-16

Executive summary
=================

A date/time mark is very handy in many fields that deal with data sets.
While Python has several modules that define a date/time type (like the
integrated ``datetime`` [1]_ or ``mx.DateTime`` [2]_), NumPy lacks one.

In this document, we propose the addition of a series of date/time
types to fill this gap. The requirements for the proposed types are
twofold: 1) they have to be fast to operate with and 2) they have to
be as compatible as possible with the existing ``datetime`` module that
comes with Python.

Types proposed
==============

To start with, it is virtually impossible to come up with a single
date/time type that fills the needs of every use case. So, after
pondering different possibilities, we have settled on *two*
different types, namely ``datetime64`` and ``timedelta64`` (these names
are preliminary and can be changed), that can have different resolutions
so as to cover different needs.

**Important note:** the resolution is conceived here as metadata that
*complements* a date/time dtype, *without changing the base type*.

A detailed description of the proposed types follows.

``datetime64``
--------------

It represents a time that is absolute (i.e. not relative). It is
implemented internally as an ``int64`` type. The internal epoch is
the POSIX epoch (see [3]_).

Resolution
~~~~~~~~~~

It accepts different resolutions, and each resolution supports a
different time span. The table below lists the supported resolutions
with their corresponding time spans.

+------+-------------+--------------------------+
| Code | Meaning     | Time span (years)        |
+======+=============+==========================+
| Y    | year        | [9.2e18 BC, 9.2e18 AD]   |
+------+-------------+--------------------------+
| Q    | quarter     | [2.3e18 BC, 2.3e18 AD]   |
+------+-------------+--------------------------+
| M    | month       | [7.6e17 BC, 7.6e17 AD]   |
+------+-------------+--------------------------+
| W    | week        | [1.7e17 BC, 1.7e17 AD]   |
+------+-------------+--------------------------+
| d    | day         | [2.5e16 BC, 2.5e16 AD]   |
+------+-------------+--------------------------+
| h    | hour        | [1.0e15 BC, 1.0e15 AD]   |
+------+-------------+--------------------------+
| m    | minute      | [1.7e13 BC, 1.7e13 AD]   |
+------+-------------+--------------------------+
| s    | second      | [2.9e11 BC, 2.9e11 AD]   |
+------+-------------+--------------------------+
| ms   | millisecond | [2.9e8 BC, 2.9e8 AD]     |
+------+-------------+--------------------------+
| us   | microsecond | [290301 BC, 294241 AD]   |
+------+-------------+--------------------------+
| ns   | nanosecond  | [1678 AD, 2262 AD]       |
+------+-------------+--------------------------+
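
The spans for the fine-grained resolutions follow from the signed 64-bit
range: a counter can reach about ``2**63`` ticks on either side of the
epoch. The following is only a rough sketch of that arithmetic in plain
Python. The names ``TICKS_PER_SECOND`` and ``half_span_years`` are
hypothetical helpers, and the mean year length of 365.2425 days is an
assumption of the sketch, not something this proposal fixes.

```python
# Rough estimate of how far a signed 64-bit tick counter reaches from the
# epoch, for resolutions whose ticks have a fixed length in seconds.
from fractions import Fraction

# Assumption of this sketch: a mean Gregorian year of 365.2425 days.
SECONDS_PER_YEAR = Fraction(3652425, 10000) * 86400

# Hypothetical table: ticks per second for two fixed-length resolutions.
TICKS_PER_SECOND = {
    "us": 10**6,
    "ns": 10**9,
}


def half_span_years(code):
    """Approximate number of years reachable on each side of the epoch."""
    max_ticks = 2**63 - 1  # largest value an int64 can hold
    seconds = Fraction(max_ticks, TICKS_PER_SECOND[code])
    return float(seconds / SECONDS_PER_YEAR)


for code in ("us", "ns"):
    print(code, round(half_span_years(code)))
```

For ``ns`` this gives roughly 292 years, which agrees with the 1678 to
2262 range around the 1970 epoch.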

Building a ``datetime64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed ways to specify the resolution in the dtype constructor
are:

Using parameters in the constructor::

  dtype('datetime64', res="us")  # the default res. is microseconds

Using the long string notation::

  dtype('datetime64[us]')   # equivalent to dtype('datetime64')

Using the short string notation::

  dtype('T8[us]')   # equivalent to dtype('T8')
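
To make the string notations concrete, here is a minimal sketch of how
such a string could be split into a base type and a resolution code.
The function ``parse_datetime_dtype`` and its regular expression are
hypothetical and not part of any NumPy API. The sketch also accepts the
``timedelta64`` and ``t8`` names described later in this document, and
it does not validate the resolution code itself.

```python
import re

# Hypothetical mapping from the names used in this proposal to a base type.
_KINDS = {
    "datetime64": "datetime64",
    "T8": "datetime64",
    "timedelta64": "timedelta64",
    "t8": "timedelta64",
}
_PATTERN = re.compile(r"^(datetime64|timedelta64|T8|t8)(?:\[([A-Za-z]+)\])?$")


def parse_datetime_dtype(spec, default_res="us"):
    """Split a dtype string into (base type, resolution code)."""
    match = _PATTERN.match(spec)
    if match is None:
        raise TypeError("not a date/time dtype string: %r" % (spec,))
    name, res = match.groups()
    # A missing "[...]" suffix falls back to the default resolution.
    return _KINDS[name], res or default_res


print(parse_datetime_dtype("datetime64"))
print(parse_datetime_dtype("T8[Y]"))
```

Both notations reduce to the same pair, which is what makes the long and
short forms interchangeable.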

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``datetime`` class of the
``datetime`` module of Python only when using a resolution of
microseconds. For other resolutions, the conversion process will
lose precision or will overflow as needed.
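
The precision loss and overflow mentioned above can be illustrated with
the standard library alone. This is a sketch under stated assumptions,
not the proposed implementation: ``to_ticks``, ``from_ticks`` and
``TICKS_PER_SECOND`` are hypothetical names, only naive datetimes are
handled, and conversion to a coarser resolution is assumed to truncate.

```python
import datetime

EPOCH = datetime.datetime(1970, 1, 1)  # POSIX epoch
INT64_MIN, INT64_MAX = -2**63, 2**63 - 1

# Hypothetical table: ticks per second for a few resolutions.
TICKS_PER_SECOND = {"s": 1, "ms": 10**3, "us": 10**6, "ns": 10**9}


def to_ticks(value, res):
    """Convert a naive datetime to an int64 tick count since the epoch."""
    delta = value - EPOCH
    micros = (delta.days * 86400 + delta.seconds) * 10**6 + delta.microseconds
    # Floor division drops whatever the target resolution cannot hold.
    ticks = micros * TICKS_PER_SECOND[res] // 10**6
    if not INT64_MIN <= ticks <= INT64_MAX:
        raise OverflowError("%s does not fit in int64 at %r" % (value, res))
    return ticks


def from_ticks(ticks, res):
    """Convert a tick count back to a datetime (microseconds at best)."""
    micros = ticks * 10**6 // TICKS_PER_SECOND[res]
    return EPOCH + datetime.timedelta(microseconds=micros)


stamp = datetime.datetime(2008, 7, 16, 13, 39, 25, 315123)
print(from_ticks(to_ticks(stamp, "us"), "us") == stamp)  # exact round trip
print(from_ticks(to_ticks(stamp, "s"), "s"))  # sub-second part is dropped
try:
    to_ticks(datetime.datetime(3000, 1, 1), "ns")
except OverflowError as exc:
    print("overflow:", exc)
```

Only the microsecond resolution round-trips every ``datetime`` value
exactly. Coarser resolutions drop digits, and nanoseconds cannot reach
dates that ``datetime`` itself accepts.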

``timedelta64``
---------------

It represents a time that is relative (i.e. not absolute). It is
implemented internally as an ``int64`` type.

Resolution
~~~~~~~~~~

It accepts different resolutions, and each resolution supports a
different time span. The table below lists the supported resolutions
with their corresponding time spans.

+------+-------------+--------------------+
| Code | Meaning     | Time span          |
+======+=============+====================+
| W    | week        | +- 1.7e17 years    |
+------+-------------+--------------------+
| D    | day         | +- 2.5e16 years    |
+------+-------------+--------------------+
| h    | hour        | +- 1.0e15 years    |
+------+-------------+--------------------+
| m    | minute      | +- 1.7e13 years    |
+------+-------------+--------------------+
| s    | second      | +- 2.9e11 years    |
+------+-------------+--------------------+
| ms   | millisecond | +- 2.9e8 years     |
+------+-------------+--------------------+
| us   | microsecond | +- 2.9e5 years     |
+------+-------------+--------------------+
| ns   | nanosecond  | +- 292 years       |
+------+-------------+--------------------+
| ps   | picosecond  | +- 106 days        |
+------+-------------+--------------------+
| fs   | femtosecond | +- 2.6 hours       |
+------+-------------+--------------------+
| as   | attosecond  | +- 9.2 seconds     |
+------+-------------+--------------------+

Building a ``timedelta64`` dtype
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The proposed ways to specify the resolution in the dtype constructor
are:

Using parameters in the constructor::

  dtype('timedelta64', res="us")  # the default res. is microseconds

Using the long string notation::

  dtype('timedelta64[us]')   # equivalent to dtype('timedelta64')

Using the short string notation::

  dtype('t8[us]')   # equivalent to dtype('t8')

Compatibility issues
~~~~~~~~~~~~~~~~~~~~

This will be fully compatible with the ``timedelta`` class of the
``datetime`` module of Python only when using a resolution of
microseconds. For other resolutions, the conversion process will
lose precision or will overflow as needed.

Example of use
==============

Here is an example of use for ``datetime64``::

  In [10]: t = numpy.zeros(5, dtype="datetime64[ms]")

  In [11]: t[0] = datetime.datetime.now()  # setter in action

  In [12]: t[0]
  Out[12]: '2008-07-16T13:39:25.315'   # representation in ISO 8601 format

  In [13]: print t
  [2008-07-16T13:39:25.315  1970-01-01T00:00:00.0
   1970-01-01T00:00:00.0  1970-01-01T00:00:00.0  1970-01-01T00:00:00.0]

  In [14]: t[0].item()     # getter in action
  Out[14]: datetime.datetime(2008, 7, 16, 13, 39, 25, 315000)

  In [15]: print t.dtype
  datetime64[ms]

And here is an example of use for ``timedelta64``::

  In [8]: t1 = numpy.zeros(5, dtype="datetime64[s]")

  In [9]: t2 = numpy.ones(5, dtype="datetime64[s]")

  In [10]: t = t2 - t1

  In [11]: t[0] = 24  # setter in action (setting to 24 seconds)

  In [12]: t[0]
  Out[12]: 24       # representation as an int64

  In [13]: print t
  [24  1  1  1  1]

  In [14]: t[0].item()     # getter in action
  Out[14]: datetime.timedelta(0, 24)

  In [15]: print t.dtype
  timedelta64[s]

Operating with date/time arrays
===============================

``datetime64`` vs ``datetime64``
--------------------------------

The only operation allowed between absolute dates is subtraction::

  In [10]: numpy.ones(5, "T8") - numpy.zeros(5, "T8")
  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

But not other operations::

  In [11]: numpy.ones(5, "T8") + numpy.zeros(5, "T8")
  TypeError: unsupported operand type(s) for +: 'numpy.ndarray' and 'numpy.ndarray'

``datetime64`` vs ``timedelta64``
---------------------------------

It will be possible to add and subtract relative times from absolute
dates::

  In [10]: numpy.zeros(5, "T8[Y]") + numpy.ones(5, "t8[Y]")
  Out[10]: array([1971, 1971, 1971, 1971, 1971], dtype=datetime64[Y])

  In [11]: numpy.ones(5, "T8[Y]") - 2 * numpy.ones(5, "t8[Y]")
  Out[11]: array([1969, 1969, 1969, 1969, 1969], dtype=datetime64[Y])

But not other operations::

  In [12]: numpy.ones(5, "T8[Y]") * numpy.ones(5, "t8[Y]")
  TypeError: unsupported operand type(s) for *: 'numpy.ndarray' and 'numpy.ndarray'
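
The rules of the two sections above can be summarized as a small lookup
table. This is only an illustrative sketch: ``RULES`` and
``result_kind`` are hypothetical names, and the entry that allows
``timedelta64 + datetime64`` assumes addition is meant to be
commutative, which the text above does not state explicitly.

```python
# Hypothetical rule table: (operator, left kind, right kind) -> result kind.
RULES = {
    ("-", "datetime64", "datetime64"): "timedelta64",
    ("+", "datetime64", "timedelta64"): "datetime64",
    ("-", "datetime64", "timedelta64"): "datetime64",
    # Assumption of this sketch: addition is commutative.
    ("+", "timedelta64", "datetime64"): "datetime64",
}


def result_kind(op, left, right):
    """Return the kind produced by `left op right`, or raise TypeError."""
    try:
        return RULES[(op, left, right)]
    except KeyError:
        raise TypeError(
            "unsupported operand type(s) for %s: %r and %r" % (op, left, right)
        ) from None


print(result_kind("-", "datetime64", "datetime64"))
print(result_kind("+", "datetime64", "timedelta64"))
try:
    result_kind("+", "datetime64", "datetime64")
except TypeError as exc:
    print(exc)
```

Any combination missing from the table is rejected, which mirrors the
``TypeError`` shown for the unsupported operations.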

``timedelta64`` vs anything
---------------------------

Finally, it will be possible to operate with relative times as if they
were regular int64 dtypes *as long as* the result can be converted back
into a ``timedelta64``::

  In [10]: numpy.ones(5, 't8')
  Out[10]: array([1, 1, 1, 1, 1], dtype=timedelta64[us])

  In [11]: (numpy.ones(5, 't8[M]') + 2) ** 3
  Out[11]: array([27, 27, 27, 27, 27], dtype=timedelta64[M])

But::

  In [12]: numpy.ones(5, 't8') + 1j
  TypeError: The result cannot be converted into a ``timedelta64``

dtype/resolution conversions
============================

For changing the date/time dtype of an existing array, we propose to use
the ``.astype()`` method. This will be mainly useful for changing
resolutions.

For example, for absolute dates::

  In [10]: t1 = numpy.zeros(5, dtype="datetime64[s]")

  In [11]: print t1
  [1970-01-01T00:00:00  1970-01-01T00:00:00  1970-01-01T00:00:00
   1970-01-01T00:00:00  1970-01-01T00:00:00]

  In [12]: print t1.astype('datetime64[d]')
  [1970-01-01  1970-01-01  1970-01-01  1970-01-01  1970-01-01]

For relative times::

  In [10]: t1 = numpy.ones(5, dtype="timedelta64[s]")

  In [11]: print t1
  [1 1 1 1 1]

  In [12]: print t1.astype('timedelta64[ms]')
  [1000  1000  1000  1000  1000]

Changing directly from/to relative to/from absolute dtypes will not be
supported::

  In [13]: numpy.zeros(5, dtype="datetime64[s]").astype('timedelta64')
  TypeError: data type cannot be converted to the desired type
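
For resolutions whose ticks have a fixed length, a resolution change
amounts to rescaling the underlying integers. The sketch below shows
that arithmetic on plain Python lists. ``NANOS_PER_TICK`` and
``convert_ticks`` are hypothetical names, calendar units such as months
and years are left out because their length varies, truncation toward
the earlier tick is an assumption, and no overflow check is done.

```python
# Hypothetical table: length of one tick in nanoseconds (fixed-length units).
NANOS_PER_TICK = {
    "ns": 1,
    "us": 10**3,
    "ms": 10**6,
    "s": 10**9,
    "m": 60 * 10**9,
    "h": 3600 * 10**9,
    "d": 86400 * 10**9,
}


def convert_ticks(ticks, src, dst):
    """Re-express integer tick counts given in `src` units in `dst` units."""
    num, den = NANOS_PER_TICK[src], NANOS_PER_TICK[dst]
    # Floor division: going to a coarser unit discards the remainder.
    return [t * num // den for t in ticks]


print(convert_ticks([1, 1, 1, 1, 1], "s", "ms"))
print(convert_ticks([0, 86399, 86400], "s", "d"))
```

Going to a finer resolution is exact, while going to a coarser one
drops the part that the new resolution cannot represent.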

Final considerations
====================

Why the ``origin`` metadata disappeared
---------------------------------------

During the discussion of the date/time dtypes on the NumPy list, the
idea of having an ``origin`` metadata that complemented the definition
of the absolute ``datetime64`` was initially found to be useful.

However, after thinking more about this, Ivan and I found that the
combination of an absolute ``datetime64`` with a relative
``timedelta64`` offers the same functionality while removing the
need for the additional ``origin`` metadata. This is why we have
removed it from this proposal.
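
As a small illustration of this argument, the standard ``datetime``
module can stand in for the proposed types: an absolute origin plus
relative offsets carries the same information as a time type with a
custom origin. The values below are made up for the example.

```python
import datetime

# Hypothetical measurement times, stored as seconds elapsed since the
# start of an experiment rather than since the POSIX epoch.
origin = datetime.datetime(2008, 7, 16, 13, 0, 0)
offsets = [datetime.timedelta(seconds=s) for s in (0, 30, 90)]

# Absolute time = origin + relative offset; no extra metadata is required.
absolute = [origin + off for off in offsets]

# The relative view is recovered by subtracting the origin again.
recovered = [stamp - origin for stamp in absolute]

print(absolute[-1].isoformat())
print(recovered == offsets)
```

The origin lives in an ordinary absolute value instead of in the dtype,
so the dtype only needs to carry the resolution.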

Resolution and dtype issues
---------------------------

The date/time dtype's resolution metadata cannot be used in general as
part of typical dtype usage. For example, in::

  numpy.zeros(5, dtype=numpy.datetime64)

we have yet to find a sensible way to pass the resolution. Perhaps the
following would work::

  numpy.zeros(5, dtype=numpy.datetime64(res='Y'))

but we are not sure whether this would collide with the spirit of the
NumPy dtypes.

At any rate, one can always do::

  numpy.zeros(5, dtype=numpy.dtype('datetime64', res='Y'))

BTW, prior to all of this, one should also elucidate whether::

  numpy.dtype('datetime64', res='Y')

or::

  numpy.dtype('datetime64[Y]')
  numpy.dtype('T8[Y]')

would be a consistent way to instantiate a dtype in NumPy. We really do
think that could be a good way, but we would need to hear the opinion of
the expert. Travis?

.. [1] http://docs.python.org/lib/module-datetime.html
.. [2] http://www.egenix.com/products/python/mxBase/mxDateTime
.. [3] http://en.wikipedia.org/wiki/Unix_time
.. Local Variables:
.. mode: rst
.. coding: utf-8
.. fill-column: 72
.. End: