MultiIndexing Issue #8893

Closed
Aurthes opened this issue Nov 25, 2014 · 4 comments

Comments

@Aurthes

Aurthes commented Nov 25, 2014

Let 'b' be a large DataFrame with a MultiIndex.

I pulled the block at position 4 of the first index level, and I got the block fine.

a = b.ix[b.index.levels[0][4]]

Then I printed the index of the block, which also seemed fine, code and result as follows:

print a.index

sequence  attribute
4         count
          price
          quantity

When I pulled the values, they seemed fine too:

a.index.values

array([(4L, 'count'), (4L, 'price'), (4L, 'quantity')], dtype=object)

but when I looked at it in a different way, things didn't make much sense:

a.index

MultiIndex(levels=[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, ...], [u'count', u'price', u'quantity']],
labels=[[4, 4, 4], [0, 1, 2]],
names=[u'sequence', u'attribute'])

When I pulled the first level from the index of the block, I got:

a.index.levels[0]

Int64Index([0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23, 24, 25, 26, 27, 28, 29, 30, 31, 32, 33, 34, 35, 36, 37, 38, 39, 40, 41, 42, 43, 44, 45, 46, 47, 48, 49, 50, 51, 52, 53, 54, 55, 56, 57, 58, 59, 60, 61, 62, 63, 64, 65, 66, 67, 68, 69, 70, 71, 72, 73, 74, 75, 76, 77, 78, 79, 80, 81, 82, 83, 84, 85, 86, 87, 88, 89, 90, 91, 92, 93, 94, 95, 96, 97, 98, 99, ...], dtype='int64')

Which is not the

Int64Index([4], dtype='int64')

I anticipated.

Of course I can

pd.MultiIndex.from_tuples(a.index.values)

and create the thing I am expecting, but this is a pain in the ass. Any solutions? Thanks!
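For reference, a minimal sketch of the rebuild described above, on a small stand-in frame. The frame and the names `b`, `a`, `rebuilt`, and `trimmed` are illustrative, not the original data. It selects with `.loc` and a list so the block keeps both index levels, as in the output shown earlier. The second option assumes `MultiIndex.remove_unused_levels()` is available, which holds for pandas 0.20 and later but not for the version current when this issue was opened.

```python
import pandas as pd

# Hypothetical stand-in for the large frame 'b' from the report.
idx = pd.MultiIndex.from_product(
    [range(10), ["count", "price", "quantity"]],
    names=["sequence", "attribute"],
)
b = pd.DataFrame({"value": range(len(idx))}, index=idx)

# Select the block whose first-level key is levels[0][4]; the list keeps both levels.
a = b.loc[[b.index.levels[0][4]]]

# Number of first-level values still stored on the block's index.
n_levels_before = len(a.index.levels[0])

# Option 1 (from the report): rebuild the index from its tuples.
rebuilt = pd.MultiIndex.from_tuples(a.index.values, names=a.index.names)

# Option 2 (assumes pandas >= 0.20): drop level values that no row refers to.
trimmed = a.index.remove_unused_levels()

print(n_levels_before, list(rebuilt.levels[0]), list(trimmed.levels[0]))
```

Either result can be assigned back with `a.index = rebuilt` (or `trimmed`) if downstream code really depends on `.levels`.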

@jreback
Contributor

jreback commented Nov 25, 2014

You will have to show a copy-pastable example. I'm not really sure what you are trying to do, nor what error you are seeing. Also show the output of pd.show_versions().

@Aurthes
Author

Aurthes commented Nov 25, 2014

import pandas as pd
pd.show_versions()
def block_factory(lvl1,lvl2):
    idx1 = [lvl1]
    idx2 = [lvl2]
    idx3 = ['ha','haha','hahaha']
    idx = pd.MultiIndex.from_product([idx1,idx2,idx3])
    df = pd.DataFrame(range(3),index=idx)
    return df
df1=block_factory(1,'a')
df2=block_factory(2,'b')
df = pd.concat((df1,df2))
df_temp = df.ix[df.index.levels[0][1]]
df_temp
df_temp.index.levels[0]

I expected only 'b', but I get ['a', 'b'].
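On current pandas the snippet above no longer runs as written, because `.ix` was removed in pandas 1.0. A rough equivalent using `.loc`, assuming a recent pandas and Python 3 (the column name `value` and the variables `stored` and `used` are added here for illustration):

```python
import pandas as pd

def block_factory(lvl1, lvl2):
    """Build a 3-row frame indexed by (lvl1, lvl2, one of three fixed labels)."""
    idx = pd.MultiIndex.from_product([[lvl1], [lvl2], ["ha", "haha", "hahaha"]])
    return pd.DataFrame({"value": range(3)}, index=idx)

df = pd.concat((block_factory(1, "a"), block_factory(2, "b")))

# .loc with a scalar key drops the first level, as .ix did in the original snippet.
df_temp = df.loc[df.index.levels[0][1]]

# Values stored on the level, which may include ones no row uses.
stored = list(df_temp.index.levels[0])

# Values the remaining rows actually use.
used = list(df_temp.index.get_level_values(0).unique())

print(stored, used)
```

Comparing `stored` with `used` makes the mismatch in the report visible without depending on how the index is printed.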

@jreback
Contributor

jreback commented Nov 26, 2014

I want you to run pd.show_versions() and post the output.

@jreback
Contributor

jreback commented Nov 26, 2014

.levels is a semi-private attribute.
The reason the levels are still there (but not referenced by the labels) is an efficiency/implementation detail. You can reconstruct the index if you want, but it really doesn't affect anything. A fuller description is in issue #2770.

In [16]: df_temp.index.get_level_values(0)
Out[16]: Index([u'b', u'b', u'b'], dtype='object')

In [17]: df_temp.index.get_level_values(1)
Out[17]: Index([u'ha', u'haha', u'hahaha'], dtype='object')
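To illustrate the point about levels versus labels: a MultiIndex stores each level's distinct values once, plus a per-row array of integer positions into them (shown as `labels` in the output earlier in this thread, and named `codes` in pandas 0.24 and later). A small sketch, assuming a recent pandas; the index here is made up for illustration:

```python
import pandas as pd

mi = pd.MultiIndex.from_product([["a", "b"], ["ha", "haha", "hahaha"]])
sub = mi[3:]  # keep only the rows under 'b'

# Slicing keeps the stored level values and only narrows the integer codes.
level_values = list(sub.levels[0])  # distinct values stored for level 0
codes = list(sub.codes[0])          # per-row positions into level_values

# get_level_values resolves the codes, so it reports only what the rows use.
resolved = list(sub.get_level_values(0))

print(level_values, codes, resolved)
```

This is why `get_level_values` is the safer accessor when the question is "which keys does this block contain", while `.levels` answers "which keys does the index know about".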
