dataframe.drop(col,axis=1) does not drop column from column.levels in multiindex dataframe #3686
Comments
not sure what you mean by WAR here. |
I can confirm something similar. This might have something to do with the rebinding in the code below:
import numpy as np
from pandas import DataFrame, MultiIndex

randint = np.random.randint

columns = MultiIndex.from_tuples([('gspc', 'adj_close'),
                                  ('pf', 'adj_close'),
                                  ('aapl', 'adj_close'),
                                  ('nvda', 'adj_close')])
n = 50
x = np.random.randn(n) * 31.563 + 1621.127
y = np.random.randn(n) * 181.441 + 4442.121
z = np.random.randn(n) * 31.563 + 1621.127
w = np.random.randn(n) * 181.441 + 4442.121
data = DataFrame(np.column_stack((x, y, z, w)), columns=columns)
stocks = {'nvda': randint(100, 1000), 'gspc': randint(100, 1000), 'aapl': randint(100, 1000)}
data['pf', 'adj_close'] = np.zeros(n)
# Accumulate each holding into the portfolio column, then drop that ticker.
# (.items() replaces the Python 2 .iteritems().)
for ticker, nshares in stocks.items():
    data['pf', 'adj_close'] += data[ticker, 'adj_close'] * nshares
    data = data.drop(ticker, axis=1, level=0)
|
WAR = workaround. Unconsciously threw that in. Sorry. |
That's a long standing issue: #2770 |
It seems to me that if you are explicitly dropping levels, then the index should be recomputed, since that's what is being requested. As a corner case, storing a huge MultiIndex while keeping only a single column would be inefficient. |
Re #2770 (comment): this is not a bug (removed label). The comment about wasting memory is relevant, but since the levels should be shared with the @amol-desai, using that, you can reconstruct the frame to "squeeze" out unused level factors. |
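One way to "squeeze" out unused level values, as suggested above, might look like the sketch below. It is not taken from the thread: the frame, the tickers, and the variable names are hypothetical, and `MultiIndex.remove_unused_levels()` comes from pandas releases later than this discussion, so treat it as an assumption about your installed version.

```python
import numpy as np
import pandas as pd

# Hypothetical frame with a two-level column index (tickers are illustrative).
columns = pd.MultiIndex.from_tuples(
    [("gspc", "adj_close"), ("aapl", "adj_close"), ("nvda", "adj_close")]
)
df = pd.DataFrame(np.arange(9.0).reshape(3, 3), columns=columns)

# Drop one ticker from the outer level of the columns.
dropped = df.drop("aapl", axis=1, level=0)

# Option 1: rebuild the column index from the tuples that are actually present.
rebuilt = dropped.copy()
rebuilt.columns = pd.MultiIndex.from_tuples(dropped.columns.tolist())

# Option 2: ask pandas to prune level values that no column refers to
# (assumes a pandas version that provides remove_unused_levels).
squeezed = dropped.copy()
squeezed.columns = dropped.columns.remove_unused_levels()

print(list(rebuilt.columns.levels[0]))
print(list(squeezed.columns.levels[0]))
```

Both options leave the data untouched and only replace the column index, so iterating over `columns.levels[0]` afterwards visits just the tickers that still have columns.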
closing as not a bug |
I have a multiindex dataframe from which I am dropping columns using df.drop(col,axis=1). Then, I am looking through column.levels[0] and doing some operations on all the columns. However, when I try to do this, pandas looks for the removed column since it is not removed from column.levels. Is this a bug? Is there a WAR?
Here is the df I am working with:
Here is how I am dropping the columns:
Here is where the issue is:
Method used to generate DF as requested in the comment:
tickers is a list of stock ticker strings.
stock is an object that has a ticker property among others.
portfolio is an object that is a collection of stocks.
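Since the original snippets are not shown here, the following is a minimal sketch of the behavior described, using a small hypothetical frame (the tickers, data, and variable names are assumptions, not the reporter's code). It compares the labels visible in the columns after a drop with the values still stored in `columns.levels[0]`.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for the reporter's frame.
columns = pd.MultiIndex.from_tuples(
    [("gspc", "adj_close"), ("aapl", "adj_close"), ("nvda", "adj_close")]
)
df = pd.DataFrame(np.arange(9.0).reshape(3, 3), columns=columns)

dropped = df.drop("aapl", axis=1, level=0)

# Labels that still have at least one column after the drop.
visible = list(dropped.columns.get_level_values(0).unique())

# Values stored in the level itself; these can include labels no longer in use.
stored = list(dropped.columns.levels[0])

print("visible:", visible)
print("stored: ", stored)

# Looping over the visible labels avoids looking up a dropped column.
for ticker in visible:
    _ = dropped[ticker, "adj_close"]
```

Looping over `get_level_values(0).unique()` rather than `levels[0]` only visits labels that still have columns, which sidesteps the lookup failure without rebuilding the index.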