column multiindex and reindex inconsistency #16626
Comments
If you look at
In [65]: df4.columns.levels[1]
Out[65]: Index(['c', 'a', 'd'], dtype='object')
Going back to your first point, that there are no
In [77]: df3.reindex(columns=[('x', 'a'), ('x', 'd')])
Out[77]:
x
a d
0 2 NaN
1 9 NaN
2 8 NaN
3 7 NaN
In [78]: df3.reindex(columns=[('x', 'a'), ('x', 'd')]).columns
Out[78]:
MultiIndex(levels=[['x'], ['a', 'd']],
labels=[[0, 0], [0, 1]])
Make sense?
Duplicate of #12319. This is a deliberate choice at the moment. Feel free to comment on the other issue.
The The
In [62]: ci2 = pd.MultiIndex.from_tuples([('x','a'),('y','a'),('x','b'),('y','b'),('x','c')])
In [63]: df5 = pd.DataFrame(np.random.randint(0,10,(4,5)), columns=ci2)
In [64]: df6 = df5.drop(('x','c'), axis=1)
In [65]: df6
Out[65]:
x y x y
a a b b
0 0 0 8 4
1 7 2 0 4
2 4 2 2 8
3 6 1 7 5
In [66]: df6.columns
Out[66]:
MultiIndex(levels=[[u'x', u'y'], [u'a', u'b', u'c']],
labels=[[0, 1, 0, 1], [0, 0, 1, 1]])
In [67]: df6.columns.levels[1][pd.unique(df6.columns.labels[1])]
Out[67]: Index([u'a', u'b'], dtype='object')
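For readers on newer pandas, where `MultiIndex.labels` has been renamed to `MultiIndex.codes`, the same check can be sketched roughly as below. The frame uses fixed values from `np.arange` instead of random integers, and the names `all_level_values` and `used_level_values` are made up for illustration.

```python
import numpy as np
import pandas as pd

# Same column layout as ci2 in the session above.
ci2 = pd.MultiIndex.from_tuples(
    [("x", "a"), ("y", "a"), ("x", "b"), ("y", "b"), ("x", "c")]
)
# Fixed values instead of random ones so the result is repeatable.
df5 = pd.DataFrame(np.arange(20).reshape(4, 5), columns=ci2)
df6 = df5.drop(("x", "c"), axis=1)

# The stored level still carries 'c' even though no column uses it.
all_level_values = list(df6.columns.levels[1])

# Index the level by the codes that are actually in use
# (.codes is the newer name for what 0.19 called .labels).
used_codes = pd.unique(df6.columns.codes[1])
used_level_values = list(df6.columns.levels[1][used_codes])

print(all_level_values)
print(used_level_values)
```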
This is defined behavior for how a MultiIndex works.
@dfolch great! Yeah, it's not automatic, but at least it's possible now (in an easy way).
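The comment above does not say which feature it means. Assuming it refers to `MultiIndex.remove_unused_levels()` (available since pandas 0.20), a minimal sketch of trimming a stale level after a drop might look like this. The frame and the variable names are hypothetical.

```python
import numpy as np
import pandas as pd

cols = pd.MultiIndex.from_tuples(
    [("x", "a"), ("y", "a"), ("x", "b"), ("y", "b"), ("x", "c")]
)
frame = pd.DataFrame(np.arange(20).reshape(4, 5), columns=cols)
frame = frame.drop(("x", "c"), axis=1)

# After the drop, 'c' is still listed in the second level.
stale_levels = list(frame.columns.levels[1])

# remove_unused_levels() returns a new MultiIndex; it has to be
# assigned back explicitly, which is the "not automatic" part.
frame.columns = frame.columns.remove_unused_levels()
trimmed_levels = list(frame.columns.levels[1])

print(stale_levels)
print(trimmed_levels)
```

The column labels themselves are unchanged by this call; only the stored levels are rebuilt.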
When passing a nonexistent column name to reindex on a dataframe without multiindex columns, the result is:
- NaN column with the "new" column name
- columns attribute matches the columns in the dataframe
The same action on a multiindex dataframe produces different results:
- NaN columns (this may not be a problem)
- columns attribute of the resulting dataframe does not match the dataframe column names (this appears to be a bug)
pandas: 0.19.1
nose: 1.3.7
pip: 9.0.1
setuptools: 27.2.0
Cython: 0.25.1
numpy: 1.11.2
scipy: 0.18.1
statsmodels: 0.6.1
xarray: None
IPython: 5.1.0
sphinx: 1.4.8
patsy: 0.4.1
dateutil: 2.6.0
pytz: 2016.7
blosc: None
bottleneck: 1.1.0
tables: 3.3.0
numexpr: 2.6.1
matplotlib: 1.5.3
openpyxl: 2.4.0
xlrd: 1.0.0
xlwt: 1.1.2
xlsxwriter: 0.9.3
lxml: 3.6.4
bs4: 4.5.1
html5lib: None
httplib2: None
apiclient: None
sqlalchemy: 1.1.4
pymysql: None
psycopg2: 2.6.2 (dt dec pq3 ext lo64)
jinja2: 2.8
boto: 2.43.0
pandas_datareader: None
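The comparison described in the report can be sketched roughly as follows. The original example code is not included here, so both frames are hypothetical stand-ins, and output on 0.19.1 may differ from what a current pandas prints.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with flat (single-level) columns.
flat = pd.DataFrame(np.arange(8).reshape(4, 2), columns=["a", "b"])
flat_result = flat.reindex(columns=["a", "d"])  # "d" does not exist in flat

# Hypothetical frame with two-level columns.
multi_cols = pd.MultiIndex.from_tuples([("x", "a"), ("x", "b")])
multi = pd.DataFrame(np.arange(8).reshape(4, 2), columns=multi_cols)
multi_result = multi.reindex(columns=[("x", "a"), ("x", "d")])  # ("x", "d") does not exist

# Compare the labels actually present with the stored levels.
print(flat_result.columns.tolist())
print(multi_result.columns.tolist())
print(multi_result.columns.levels)
```

Comparing `columns.tolist()` with `columns.levels` is one way to see whether the stored levels and the visible column names agree.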