Say df is a pandas dataframe.
df.loc[] only accepts names
df.iloc[] only accepts integers (actual placements)
df.ix[] accepts both names and integers:
When referencing rows, df.ix[row_idx, ] only wants to be given names. e.g.
import numpy as np
import pandas as pd
df = pd.DataFrame({'a': ['one', 'two', 'three', 'four', 'five', 'six'],
                   '1': np.arange(6)})
df = df.ix[2:6]
print(df)
   1      a
2  2  three
3  3   four
4  4   five
5  5    six
df.ix[0, 'a']
throws an error; it doesn't return 'three'.
When referencing columns, ix prefers integers, not names. e.g.
df.ix[2, 1]
returns 'three', not 2. (Although df.ix[2, '1'] does return 2).
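For comparison, the label-versus-position distinction can be reproduced with loc and iloc alone. This is only a minimal sketch, assuming a pandas version where ix is no longer available; the variable names by_label, by_position and label_zero_missing are made up for illustration.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'a': ['one', 'two', 'three', 'four', 'five', 'six'],
                   '1': np.arange(6)})
df = df.loc[2:5]  # label-based slice; both endpoints are included

# Label-based lookup: row label 2, column name 'a'.
by_label = df.loc[2, 'a']

# Position-based lookup: first row, and the integer position of column 'a'
# (looked up explicitly, since column order can differ between versions).
by_position = df.iloc[0, df.columns.get_loc('a')]

# Row label 0 no longer exists after the slice, so loc raises KeyError.
try:
    df.loc[0, 'a']
    label_zero_missing = False
except KeyError:
    label_zero_missing = True
```

Both by_label and by_position refer to the same cell here, because the first remaining row happens to carry the label 2.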
Oddly, I'd like the exact opposite functionality. Usually my column names are very meaningful, so in my code I reference them directly. But due to a lot of observation cleaning, the row names in my pandas data frames don't usually correspond to range(len(df)).
I realize I can use:
df.iloc[0].loc['a'] # returns three
But it seems ugly! Does anyone know of a better way to do this, so that the code would look like this?
df.foo[0, 'a'] # returns three
In fact, is it possible to add my own new method to pandas.core.frame.DataFrame, so that e.g.
df.idx(rows, cols) is in fact df.iloc[rows].loc[cols]?
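To make the request concrete, here is a rough sketch of what such an add-on might look like, assuming a pandas version that provides pd.api.extensions.register_dataframe_accessor. The accessor name foo and the class PositionLabelAccessor are hypothetical, and the sketch selects columns by name first and then rows by position, which is one possible reading of the behaviour described above rather than an established pandas feature.

```python
import numpy as np
import pandas as pd


# "foo" is a hypothetical accessor name, chosen to match the wished-for syntax.
@pd.api.extensions.register_dataframe_accessor("foo")
class PositionLabelAccessor:
    """Index rows by integer position and columns by name."""

    def __init__(self, pandas_obj):
        self._obj = pandas_obj

    def __getitem__(self, key):
        rows, cols = key
        # Select the column(s) by name, then the row(s) by position.
        return self._obj[cols].iloc[rows]


df = pd.DataFrame({'a': ['one', 'two', 'three', 'four', 'five', 'six'],
                   '1': np.arange(6)})
df = df.loc[2:5]  # row labels are now 2..5, not 0..3

first_a = df.foo[0, 'a']        # 'three'
first_two_a = df.foo[0:2, 'a']  # Series holding 'three' and 'four'
```

Selecting the columns first means a row slice still works; chaining df.iloc[rows].loc[cols] would instead treat cols as a row label whenever rows is a slice.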