Creating array from a dataframe taking one column value as a reference

python arrays for-loop dataframe

94 views


95 author reputation

I have been struggling with this issue for several days and couldn't find an answer. I hope you can help me.

This is my dataframe:

   Date              Attribute       Quantity
0 2017-12-14           large          -39
0 2017-12-15           large          -80
1 2017-12-15           large          -30
2 2017-12-14           short          -15
2 2017-12-15           short          -100
4 2017-12-15           short          -10
1 2017-12-15           short           20
3 2017-12-15           short           60
3 2017-12-15            big            80
5 2017-12-15            big           104 

What do I want to do? I would like to calculate XIRR for each Attribute item. For this I need Date and Quantity (as an array), grouped by each Attribute value listed in the second column. For example, given large, I would like to extract the dates and quantities (as an array) for large.

Given that, I think my best choice is to create separate arrays based on the Attribute column and then run the aforementioned function on each one (please let me know if you would suggest another approach to this problem). So I generated one array, df1 = df[['Date','Quantity']].as_matrix(), which produces

[[Timestamp('2017-12-14 00:00:00') -39]
 [Timestamp('2017-12-15 00:00:00') -80]
 [Timestamp('2017-12-15 00:00:00') -30]
 [Timestamp('2017-12-14 00:00:00') -15]
 [Timestamp('2017-12-15 00:00:00') -100]
 [Timestamp('2017-12-15 00:00:00') -10]
 [Timestamp('2017-12-15 00:00:00') -20]
 [Timestamp('2017-12-15 00:00:00') 60]
 [Timestamp('2017-12-15 00:00:00') -80]
 [Timestamp('2017-12-15 00:00:00') 104]]

As you can see, this array includes all the attributes, but I would like something like a for-each loop that builds one array per value in the Attribute column. How can I do this? Is this the best approach to my final goal?

Any help would be highly appreciated.

PS: I should mention that the function I would like to use works on each attribute as a group (because it requires all of that group's dates and quantities together). It works like groupby.


Author: Newbie Posted: December 27, 2017

Answer 1


1208 author reputation

Consider applying a function to each row of the DataFrame:

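def row_func(row):
    # Return the quantity for 'large' rows; other rows get None (NaN in the new column)
    if row['Attribute'] == 'large':
        return row['Quantity']
    return None

# Apply the function row by row (axis=1) and store the result
df['new_column'] = df.apply(row_func, axis=1)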
Author: it's-yer-boy-chet Posted: December 27, 2017
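
Since the function in question needs all of a group's dates and quantities together, splitting the frame with groupby may fit better than a row-wise apply. Below is a minimal sketch, assuming the sample data from the question and a pandas version that provides to_numpy() (0.24 or later; as_matrix() was deprecated and later removed). The name net_quantity is a hypothetical stand-in for the XIRR function, which the question does not show.

```python
import pandas as pd

# Sample data copied from the question's table.
df = pd.DataFrame({
    "Date": pd.to_datetime([
        "2017-12-14", "2017-12-15", "2017-12-15", "2017-12-14", "2017-12-15",
        "2017-12-15", "2017-12-15", "2017-12-15", "2017-12-15", "2017-12-15",
    ]),
    "Attribute": ["large", "large", "large", "short", "short",
                  "short", "short", "short", "big", "big"],
    "Quantity": [-39, -80, -30, -15, -100, -10, 20, 60, 80, 104],
})

# One array of [Date, Quantity] rows per distinct Attribute value.
arrays = {
    name: group[["Date", "Quantity"]].to_numpy()
    for name, group in df.groupby("Attribute")
}

def net_quantity(group):
    # Hypothetical placeholder for an XIRR function: it receives every row
    # of one Attribute group at once, so dates and quantities stay together.
    return group["Quantity"].sum()

# Run the group-level function once per Attribute value.
result = {name: net_quantity(group) for name, group in df.groupby("Attribute")}

print(arrays["large"])
print(result)
```

Here arrays["large"] holds only the three large rows, and result has one entry per attribute. Swapping net_quantity for a real XIRR routine should only require that it accept a per-group DataFrame (or the matching array from arrays).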