python - Pandas: reshaping data

June 15, 2010

I have a pandas Series that presently looks like this:

14        [yellow, pizza, restaurants]
...
160920    [automotive, auto parts & supplies]
160921    [lighting fixtures & equipment, home services]
160922    [food, pizza, candy stores]
160923    [hair removal, nail salons, beauty & spas]
160924    [hair removal, nail salons, beauty & spas]

and I want to radically reshape it into a DataFrame that looks like this...

        yellow  automotive  pizza
14           1           0      1
…
160920       0           1      0
160921       0           0      0
160922       0           0      1
160923       0           0      0
160924       0           0      0

i.e. a logical construction noting which categories each observation (row) falls into.

I'm capable of writing loop-based code to tackle the problem, but given the large number of rows I need to handle, that's going to be slow.

Does anyone know of a vectorised solution to this kind of problem? I'd be very grateful.

Edit: there are 509 categories, which I have a list of.

In [8]: from pandas import Series

In [9]: s = Series([list('abc'),list('def'),list('abef')])

In [10]: s
Out[10]:
0       [a, b, c]
1       [d, e, f]
2    [a, b, e, f]
dtype: object

In [11]: s.apply(lambda x: Series(1,index=x)).fillna(0)
Out[11]:
   a  b  c  d  e  f
0  1  1  1  0  0  0
1  0  0  0  1  1  1
2  1  1  0  0  1  1
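The apply call above still builds one small Series per row, so it may be slow on a very large input. As an alternative sketch (not the method from the answer above), a recent pandas (0.25 or later, which is an assumption about your environment) can flatten the lists with Series.explode and build the indicator columns with get_dummies. The sample data and the short category list here are hypothetical stand-ins for the real Series and the list of 509 categories:

```python
import pandas as pd

# Hypothetical sample mirroring the structure of the Series in the question
s = pd.Series(
    [
        ["yellow", "pizza", "restaurants"],
        ["automotive", "auto parts & supplies"],
        ["food", "pizza", "candy stores"],
    ],
    index=[14, 160920, 160922],
)

# Hypothetical stand-in for the known list of categories to keep as columns
categories = ["yellow", "automotive", "pizza"]

# One row per (observation, category) pair; the original index label is repeated
exploded = s.explode()

# One indicator column per category, then collapse back to one row per label
indicators = (
    pd.get_dummies(exploded)
    .groupby(level=0)
    .max()
    .astype(int)
    .reindex(columns=categories, fill_value=0)
)

print(indicators)
```

The final reindex keeps only the listed categories, in the listed order, and fills a column of zeros for any category that never occurs in the data.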
