python - Pandas: reshaping data

June 15, 2010

I have a pandas Series that presently looks like this:

14        [yellow, pizza, restaurants]
...
160920    [automotive, auto parts & supplies]
160921    [lighting fixtures & equipment, home services]
160922    [food, pizza, candy stores]
160923    [hair removal, nail salons, beauty & spas]
160924    [hair removal, nail salons, beauty & spas]

and I want to radically reshape it into a DataFrame that looks like this:

        yellow  automotive  pizza
14           1           0      1
…
160920       0           1      0
160921       0           0      0
160922       0           0      1
160923       0           0      0
160924       0           0      0

i.e. a logical construction noting which categories each observation (row) falls into.

I'm capable of writing loop-based code to tackle the problem, but given the large number of rows I need to handle, that's going to be slow.

Does anyone know of a vectorised solution to this kind of problem? I'd be grateful.

Edit: there are 509 categories, which I have a list of.

In [9]: s = Series([list('abc'),list('def'),list('abef')])

In [10]: s
Out[10]:
0       [a, b, c]
1       [d, e, f]
2    [a, b, e, f]
dtype: object

In [11]: s.apply(lambda x: Series(1,index=x)).fillna(0)
Out[11]:
   a  b  c  d  e  f
0  1  1  1  0  0  0
1  0  0  0  1  1  1
2  1  1  0  0  1  1
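The apply-based answer above builds one small Series per row, which can still be slow on a large input. A possible alternative, sketched here and not part of the original answer, is to explode the lists into one row per (observation, category) pair and build the indicator columns in one pass. It assumes a pandas version that provides `Series.explode` (0.25 or later). The sample data and the short `categories` list are hypothetical stand-ins for the real Series and the list of 509 categories.

```python
import pandas as pd

# Hypothetical sample modelled on the question, not the real data set.
s = pd.Series(
    [
        ["yellow", "pizza", "restaurants"],
        ["automotive", "auto parts & supplies"],
        ["food", "pizza", "candy stores"],
    ],
    index=[14, 160920, 160922],
)

# Hypothetical stand-in for the known list of categories.
categories = ["yellow", "automotive", "pizza"]

# One row per (observation, category) pair; the original index is repeated.
exploded = s.explode()

# One 0/1 column per category seen, then collapse back to one row per observation.
indicators = pd.get_dummies(exploded, dtype=int).groupby(level=0).max()

# Keep only the known categories, in a fixed order; absent ones become 0.
indicators = indicators.reindex(columns=categories, fill_value=0)

print(indicators)
```

The final `reindex` step is what makes use of the category list: columns that never occur in the data are still present (filled with 0), and categories outside the list are dropped.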
