mysql - Dealing with 5,000 attributes -


i have data set contains 5,000 + attributes tables looks below

id attr1 attr2, attr3  0         1       0  1         0       0  0         0       0  0         0       1 

i wish represent each record on single row example table below make more amenable data mining via clustering.

id, attr1, attr2, attr3     1       1        1 

i have tried multitude of ways of doing have tried importing mysql db , getting max value each attribute (they can 1 or 0 each id) table cant hold 5,000 + attributes.

i have tried using pivot function in excel , getting max value per attribute number of columns pivot can handle far less 5,000 i'm looking at.

i have tried importing tableua suffers fact cant handle many records

i want table 2 in either text/csv file or database table

can suggest @ all, piece of software or have not yet considered

here python script ask for

def merge_rows_by_id(path):     rows = dict()     open(path) in_file:         header = in_file.readline().rstrip()         line in in_file:             fields = line.split()             id, attributes = fields[0], fields[1:]             if id not in rows:                 rows[id] = attributes             else:                 rows[id] = [max(x) x in zip(rows[id], attributes)]      print (header)     id in rows:         print ('{},{}'.format(id, ','.join(rows[id])))   merge_rows_by_id('my-data.txt') 

which written clarity more maximum efficiency, although it's pretty efficient. however, still leave lines 5000 attributes, fewer of them.

i've seen data "structure" used in bioinformatics researchers "put know "a" on 1 row, , set of "everything" doubles, , re-doubles, etc. i've had teach them data normalization make rdbm handle they've got. usually, attr_1…n 1 trial , attr_n+1…m second trial, , on allows sensible normalization of data.


Comments

Popular posts from this blog

SPSS keyboard combination alters encoding -

Add new record to the table by click on the button in Microsoft Access -

javascript - jQuery .height() return 0 when visible but non-0 when hidden -