I cannot figure out why a simple function:
def to_integer(value):
    if value == "":
        return None
    return int(value)
converts values from str to int only when the dataframe contains no empty string "", i.e. only when no value is returned as None.
If I run:
type(to_integer('1')) == int
it returns True.
Now, using apply and to_integer with df1:
import pandas as pd
df1 = pd.DataFrame(['1', '2', '3'], columns=['integer'])
result = df1['integer'].apply(to_integer)
gives a column of integers (np.int64).
But if I apply it to this df2:
df2 = pd.DataFrame(['1', '', '3'], columns=['integer'])
result = df2['integer'].apply(to_integer)
it returns a column of floats (np.float64).
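The same upcast seems to show up without apply at all. Below is a minimal sketch that builds the Series directly from the values to_integer would return. The names with_none and without_none are only illustrative, and I am assuming the default dtype inference of the Series constructor:

```python
import numpy as np
import pandas as pd

# Values that to_integer would return for ['1', '2', '3'] and ['1', '', '3'].
without_none = pd.Series([1, 2, 3])
with_none = pd.Series([1, None, 3])

# With only integers, pandas infers an integer dtype.
print(without_none.dtype)

# With a None present, pandas infers float64 and stores the None as NaN.
print(with_none.dtype)
print(with_none.tolist())
```

So the float column appears as soon as one None is mixed in with the integers, whether or not apply is involved.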
Isn't it possible to have a dataframe with integers and None at the same time?
I use Python 3.3 and Pandas 0.12.