python - Convert pandas timezone-aware DateTimeIndex to naive timestamp, but in certain timezone
You can use the function tz_localize to make a Timestamp or DatetimeIndex timezone aware, but how can you do the opposite: how can you convert a timezone-aware Timestamp to a naive one, while preserving its timezone?
An example:
In [82]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10, freq='s', tz="Europe/Brussels")
In [83]: t
Out[83]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: S, Timezone: Europe/Brussels
I could remove the timezone by setting it to None, but then the result is converted to UTC (12 o'clock became 10):
In [86]: t.tz = None
In [87]: t
Out[87]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 10:00:00, ..., 2013-05-18 10:00:09]
Length: 10, Freq: S, Timezone: None
Is there another way I can convert a DatetimeIndex to timezone naive, while preserving the timezone it was set in?
Some context on the reason I am asking this: I want to work with timezone-naive timeseries (to avoid the extra hassle with timezones, and I do not need them for the case I am working on).
But for some reason, I have to deal with a timezone-aware timeseries in my local timezone (Europe/Brussels). As all my other data are timezone naive (but represented in my local timezone), I want to convert this timeseries to naive to work with it further, but it also has to be represented in my local timezone (so just remove the timezone info, without converting the user-visible time to UTC).
I know the time is internally stored as UTC and only converted to another timezone when you represent it, so there has to be some kind of conversion when I want to "delocalize" it. For example, with the python datetime module you can "remove" the timezone like this:
In [119]: d = pd.Timestamp("2013-05-18 12:00:00", tz="Europe/Brussels")
In [120]: d
Out[120]: <Timestamp: 2013-05-18 12:00:00+0200 CEST, tz=Europe/Brussels>
In [121]: d.replace(tzinfo=None)
Out[121]: <Timestamp: 2013-05-18 12:00:00>
So, based on this, I could do the following, but I suppose this will not be very efficient when working with a larger timeseries:
In [124]: t
Out[124]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: S, Timezone: Europe/Brussels
In [125]: pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
Out[125]:
<class 'pandas.tseries.index.DatetimeIndex'>
[2013-05-18 12:00:00, ..., 2013-05-18 12:00:09]
Length: 10, Freq: None, Timezone: None
To answer my own question, this functionality has been added to pandas in the meantime. Starting from pandas 0.15.0, you can use tz_localize(None) to remove the timezone, resulting in local time.
see whatsnew entry: http://pandas.pydata.org/pandas-docs/stable/whatsnew.html#timezone-handling-improvements
So with my example from above:
In [4]: t = pd.date_range(start="2013-05-18 12:00:00", periods=2, freq='H', tz="Europe/Brussels")
In [5]: t
Out[5]: DatetimeIndex(['2013-05-18 12:00:00+02:00', '2013-05-18 13:00:00+02:00'], dtype='datetime64[ns, Europe/Brussels]', freq='H')
Using tz_localize(None) removes the timezone information, resulting in naive local time:
In [6]: t.tz_localize(None)
Out[6]: DatetimeIndex(['2013-05-18 12:00:00', '2013-05-18 13:00:00'], dtype='datetime64[ns]', freq='H')
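For the use case described in the question (combining a timezone-aware series with data that is naive but already expressed in local time), a minimal sketch could look like the following. The series names and values are made up for illustration, and freq="60min" is used only because that alias should be accepted across pandas versions.

```python
import pandas as pd

# Hypothetical timezone-aware series in local (Brussels) time.
idx = pd.date_range("2013-05-18 12:00:00", periods=3, freq="60min",
                    tz="Europe/Brussels")
aware = pd.Series([1.0, 2.0, 3.0], index=idx)

# Hypothetical naive series whose timestamps are already local wall-clock times.
naive = pd.Series(
    [10.0, 20.0, 30.0],
    index=pd.date_range("2013-05-18 12:00:00", periods=3, freq="60min"),
)

# Drop the timezone from the index but keep the local wall-clock times,
# so both series share the same naive index and align label by label.
local = aware.tz_localize(None)
total = local + naive
print(total)
```

Here Series.tz_localize(None) acts on the index of the series; the values themselves are untouched.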
Further, you can also use tz_convert(None) to remove the timezone information while converting to UTC, yielding naive UTC time:
In [7]: t.tz_convert(None)
Out[7]: DatetimeIndex(['2013-05-18 10:00:00', '2013-05-18 11:00:00'], dtype='datetime64[ns]', freq='H')
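As a self-contained sketch, the two calls can be put side by side in a script. The part about a timezone-aware column is an addition not covered in the original answer; it assumes the .dt accessor, which exposes the same method on a Series of datetimes.

```python
import pandas as pd

t = pd.date_range("2013-05-18 12:00:00", periods=2, freq="60min",
                  tz="Europe/Brussels")

# Drop the timezone, keep the local wall-clock time (12:00, 13:00).
local_naive = t.tz_localize(None)

# Convert to UTC first, then drop the timezone (10:00, 11:00 in May, UTC+2).
utc_naive = t.tz_convert(None)

# For timezone-aware values stored in a column rather than an index,
# the same operation goes through the .dt accessor.
s = pd.Series(t)
col_local = s.dt.tz_localize(None)

print(local_naive)
print(utc_naive)
print(col_local)
```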
This is much more performant than the datetime.replace solution:
In [31]: t = pd.date_range(start="2013-05-18 12:00:00", periods=10000, freq='H', tz="Europe/Brussels")
In [32]: %timeit t.tz_localize(None)
1000 loops, best of 3: 233 µs per loop
In [33]: %timeit pd.DatetimeIndex([i.replace(tzinfo=None) for i in t])
10 loops, best of 3: 99.7 ms per loop
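Outside IPython, a rough way to check that both approaches give the same result and to compare their speed is sketched below with the standard timeit module. The index length and repeat count are arbitrary choices, and the absolute timings will differ by machine and pandas version.

```python
import timeit

import pandas as pd

t = pd.date_range("2013-05-18 12:00:00", periods=1000, freq="60min",
                  tz="Europe/Brussels")

# Vectorized approach from the answer.
vectorized = t.tz_localize(None)

# Element-wise approach from the question.
looped = pd.DatetimeIndex([ts.replace(tzinfo=None) for ts in t])

# Both should hold identical naive local timestamps.
same = bool((vectorized == looped).all())

# Time 10 repetitions of each approach (results in seconds).
fast = timeit.timeit(lambda: t.tz_localize(None), number=10)
slow = timeit.timeit(
    lambda: pd.DatetimeIndex([ts.replace(tzinfo=None) for ts in t]),
    number=10,
)
print(f"same result: {same}")
print(f"tz_localize(None): {fast:.4f} s, replace loop: {slow:.4f} s")
```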