dataframe.drop(col,axis=1) does not drop column from column.levels in multiindex dataframe #3686

amol-desai · 2013-05-22T19:17:24Z

I have a MultiIndex DataFrame from which I am dropping columns using df.drop(col, axis=1). I then loop over columns.levels[0] and do some operations on each column. However, when I do this, pandas looks for the removed column, because it has not been removed from columns.levels. Is this a bug? Is there a WAR?

Here is the df I am working with:

                ^GSPC         PF
            Adj Close  Adj Close
Date
2013-04-22    1562.50    4023.45
2013-04-23    1578.78    4099.20
2013-04-24    1578.79    4094.70
2013-04-25    1585.16    4124.25
2013-04-26    1582.24    4211.65
2013-04-29    1593.61    4340.75
2013-04-30    1597.57    4467.55
2013-05-01    1582.70    4432.25
2013-05-02    1597.59    4494.95
2013-05-03    1614.42    4539.55
2013-05-06    1617.50    4645.95
2013-05-07    1625.96    4624.65
2013-05-08    1632.69    4677.40
2013-05-09    1626.67    4637.25
2013-05-10    1633.70    4602.40
2013-05-13    1633.77    4618.60
2013-05-14    1650.34    4510.85
2013-05-15    1658.78    4362.00
2013-05-16    1650.47    4418.95
2013-05-17    1667.47    4406.95
2013-05-20    1666.29    4503.50
2013-05-21    1669.16    4471.20

Here is how I am dropping the columns:

    data = data.drop(stock.ticker,axis=1,level=0)

Here is where the issue is:

    print data.columns
    MultiIndex
    [(^GSPC, Adj Close), (PF, Adj Close)]

    print data.columns.labels
    [array([2, 3]), array([0, 0])]

    print data.columns.levels
    [Index([nvda, aapl, ^GSPC, PF], dtype=object), Index([Adj Close], dtype=object)]
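
The behaviour above can be reproduced on a small synthetic frame. This is a minimal sketch, assuming a recent pandas release in which the attribute printed above as `labels` is exposed as `codes`; the tickers and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical three-ticker frame with a two-level column index.
columns = pd.MultiIndex.from_product([["aapl", "gspc", "nvda"], ["adj_close"]])
df = pd.DataFrame(np.ones((3, 3)), columns=columns)

dropped = df.drop("nvda", axis=1, level=0)

# Labels that actually appear in the remaining columns.
present = list(dropped.columns.get_level_values(0))
# Labels stored in the level, which still include the dropped ticker.
stored = list(dropped.columns.levels[0])

print(present)
print(stored)
print(dropped.columns.codes)  # integer positions into the stored levels
```

The codes only reference the positions that are still in use, which is why the dropped label remains in `levels` without appearing in any column.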

Method used to generate DF as requested in the comment:
tickers is a list of stock ticker strings.
stock is an object that has a ticker property among others.
portfolio is an object that is a collection of stocks.

    data = getdata.get_history(tickers,dt.today()-relativedelta(months=months))
    data = data.drop(['Open','High','Low','Close','Volume'],axis=1)
    data = data.unstack(0).swaplevel(0,1,axis=1).sortlevel(0,axis=1)
    data['PF','Adj Close'] = np.zeros(len(data))
    for stock in portfolio.getStocksInPortfolio():
        data['PF','Adj Close'] += data[stock.ticker,'Adj Close'] * stock.getSharesOwned()
        data = data.drop(stock.ticker,axis=1,level=0)

cpcloud · 2013-05-22T22:09:01Z

Not sure what you mean by WAR here.

cpcloud · 2013-05-22T22:47:05Z

I can confirm something similar. This might have something to do with the rebinding of data inside the loop.

    import numpy as np
    from pandas import DataFrame, MultiIndex

    randint = np.random.randint
    columns = MultiIndex.from_tuples([('gspc', 'adj_close'),
                                      ('pf', 'adj_close'),
                                      ('aapl', 'adj_close'),
                                      ('nvda', 'adj_close')])
    n = 50
    # synthetic price series, one per column
    x = np.random.randn(n) * 31.563 + 1621.127
    y = np.random.randn(n) * 181.441 + 4442.121
    z = np.random.randn(n) * 31.563 + 1621.127
    w = np.random.randn(n) * 181.441 + 4442.121
    data = DataFrame(np.column_stack((x, y, z, w)), columns=columns)
    # random number of shares held per ticker
    stocks = {'nvda': randint(100, 1000), 'gspc': randint(100, 1000), 'aapl': randint(100, 1000)}
    data['pf', 'adj_close'] = np.zeros(n)
    for ticker, nshares in stocks.items():
        # add the position value to the portfolio column, then drop the ticker;
        # note that data is rebound on every iteration
        data['pf', 'adj_close'] += data[ticker, 'adj_close'] * nshares
        data = data.drop(ticker, axis=1, level=0)

amol-desai · 2013-05-22T23:12:39Z

WAR = workaround. Unconsciously threw that in. Sorry.

michaelaye · 2013-05-23T03:18:21Z

That's a long-standing issue: #2770
I still don't understand precisely why Wes does not consider it a bug. Something with not what is observed...
I think the 'levels' member should either be used for something or be removed, as this way it is just confusing.

cpcloud · 2013-05-23T03:46:11Z

It seems to me that if you are explicitly dropping levels, then the index should be recomputed, since that's what is being requested. As a corner case, storing a huge MultiIndex while only keeping a single column would be inefficient.
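
For readers on later pandas releases, recomputing the index after a drop can be sketched with `MultiIndex.remove_unused_levels()`, which did not exist when this thread was written. The frame below is a hypothetical stand-in for the data in the example above, not the original data.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_tuples(
    [("gspc", "adj_close"), ("pf", "adj_close"),
     ("aapl", "adj_close"), ("nvda", "adj_close")]
)
data = pd.DataFrame(np.zeros((2, 4)), columns=columns)

# Drop two tickers; their labels stay behind in columns.levels[0].
data = data.drop(["aapl", "nvda"], axis=1, level=0)

# Return an equivalent MultiIndex whose levels hold only labels in use.
data.columns = data.columns.remove_unused_levels()

compact_levels = list(data.columns.levels[0])
print(compact_levels)
```

The call returns a new index rather than modifying the existing one, so it has to be assigned back to `data.columns`.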

ghost · 2013-11-30T12:15:21Z

Re #2770 (comment): this is not a bug (removed label).
That's not what MultiIndex levels are for.

The comment about wasting memory is relevant, but since the levels should be shared with the
original frame, that's less of a concern. IPython, which most people use, is awful about keeping objects
in memory anyway. If you need a consolidate method, open an issue or use jtranter's suggestion.

@amol-desai, using that, you can reconstruct the frame to "squeeze" out unused level factors.
You should be looking through labels rather than levels, though (depending on what you're trying to do).
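
A rough sketch of both suggestions follows: iterating over the labels that are actually present instead of over `levels`, and rebuilding the column index from its tuples to squeeze out unused level values. The suggestion referred to above is not quoted in this thread, so the rebuild step is only an assumption about what such a reconstruction could look like; the frame is hypothetical.

```python
import numpy as np
import pandas as pd

columns = pd.MultiIndex.from_product([["aapl", "gspc", "pf"], ["adj_close"]])
data = pd.DataFrame(np.arange(6.0).reshape(2, 3), columns=columns)
data = data.drop("aapl", axis=1, level=0)

# Loop over tickers that still have columns, not over columns.levels[0].
tickers_present = data.columns.get_level_values(0).unique()
totals = {ticker: data[ticker, "adj_close"].sum() for ticker in tickers_present}

# Rebuild the column index from its tuples so the levels are recomputed.
rebuilt = data.copy()
rebuilt.columns = pd.MultiIndex.from_tuples(
    data.columns.tolist(), names=data.columns.names
)

print(totals)
print(list(rebuilt.columns.levels[0]))
```

Looping over `get_level_values(0).unique()` avoids the lookup of the removed column described in the original report, without needing to touch the index at all.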

jreback · 2014-02-26T22:34:07Z

closing as not a bug

michaelaye mentioned this issue Nov 30, 2013

BUG: Indexes still include values that have been deleted #2770

Closed

jreback closed this as completed Feb 26, 2014

kdebrab mentioned this issue Apr 7, 2016

pandas dataframe.drop(col,axis=1) does not drop column from column.levels in multiindex dataframe #12822

Closed

astafan8 added a commit to Dominik-Vogel/Qcodes that referenced this issue May 9, 2019

Add proper fix for slicing multiindex dataframe

bfc1574

... in Dataset context manager example notebook. Based on this pandas-dev/pandas#3686.
