algorithm - Find the optimum number of non-uniform bins


Problem: find the optimum number of non-uniform bins to show a range of data points.

I have a bunch of data points (let's assume they are the prices of different mobiles). I need to sort these mobile phones into categories based on price. The bin size (which in this example refers to the price range) need not be uniform: there might be lots of mobiles in the low price category and only a few in the long-tail category.

Is there an efficient algorithm to find the optimum number of bins required, and the number of data points (in this case mobile phones) that should go into each category?

This is not a standard formula, but I wanted to post it because it seems to work for the data set I tested.

  1. Find the average price of the mobiles.

    Example: 5 mobiles with prices 10, 20, 40, 80, 200

    Average = 350/5 = 70

  2. Subtract the minimum price from the average price: 70 - 10 = 60 -> call this n1

  3. Subtract the average price from the maximum price: 200 - 70 = 130 -> call this n2

  4. Find the ratio n2/n1: 130/60 ≈ 2.17, rounded to 2

    This indicates that it is better to have 2 bins in the lower price range for every 1 bin in the higher range.

  5. So, in this example, take 2 bins below 70: ranges 0 - 35 (2 mobiles) and 36 - 70 (1 mobile)

    and 1 bin above 70: range 71 - 200 (2 mobiles)

As you can see, the number of bins and the bin sizes are reasonably optimal for this data set.
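
The five steps above can be sketched in Python as follows. This is only a minimal illustration of the heuristic, not a tested general method. The function names `heuristic_bins` and `count_per_bin` are made up for this sketch, and several details are assumptions that the steps leave open: the lowest bin starts at 0 (as in the worked example, rather than at the minimum price), the lower bins all have equal width, the ratio is never allowed to fall below 1, and a data set where every price is equal gets a single bin.

```python
def heuristic_bins(prices, lower_start=0):
    """Return bin edges for the mean-based heuristic described above.

    Assumption: the lowest bin starts at `lower_start` (0 in the worked
    example) and the bins below the average all have the same width.
    """
    if not prices:
        raise ValueError("prices must not be empty")

    avg = sum(prices) / len(prices)   # step 1: average price
    n1 = avg - min(prices)            # step 2: average minus minimum
    n2 = max(prices) - avg            # step 3: maximum minus average

    if n1 == 0:
        # Assumption: all prices are equal, so one bin is enough.
        return [float(lower_start), float(max(prices))]

    # Step 4: ratio n2/n1, rounded; at least one lower bin (assumption).
    # Note that Python's round() rounds exact halves to the even integer.
    lower_bins = max(1, round(n2 / n1))

    # Step 5: `lower_bins` equal-width bins below the average,
    # then a single bin from the average up to the maximum.
    width = (avg - lower_start) / lower_bins
    edges = [lower_start + i * width for i in range(lower_bins)]
    edges.append(avg)
    edges.append(float(max(prices)))
    return edges


def count_per_bin(prices, edges):
    """Count prices per bin; each bin includes its upper edge."""
    counts = [0] * (len(edges) - 1)
    for p in prices:
        for i in range(len(counts)):
            if p <= edges[i + 1]:
                counts[i] += 1
                break
    return counts


prices = [10, 20, 40, 80, 200]
edges = heuristic_bins(prices)
print(edges)                         # [0.0, 35.0, 70.0, 200.0]
print(count_per_bin(prices, edges))  # [2, 1, 2]
```

For the example prices this reproduces the bins 0 - 35, 36 - 70 and 71 - 200 with 2, 1 and 2 mobiles. The heuristic only ever splits the range below the average, so it suits right-skewed data such as prices with a long tail; for other distributions the assumptions above would need revisiting.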

