python - Pandas Groupby and create new column with custom values

Folks,

I've searched StackOverflow for my use case but haven't found anything useful. If you think this problem has already been solved, please point me to the relevant question.

Use-case.

I have the following data-frame.

  Maturity,Periods  
  0.5,2   
  0.5,2   
  1.0,3  
  1.0,3   
  1.0,3  

As you can see, each maturity is repeated as many times as the number in the Periods column. What I want is a new column that is 0 everywhere except for a single 1 in the last row of each maturity group. The expected dataframe looks like this:

  Maturity,Periods,CP   
  0.5,2,0  
  0.5,2,1   
  1.0,3,0    
  1.0,3,0   
  1.0,3,1  

As you can see in the expected dataframe, the number of 0s in the CP column is 1 less than the value in the Periods column and the remaining value is 1.

I tried the following pandas groupby operation, but it fails:

new_df['CP'] = new_df.groupby(['Maturity'])['Periods'].apply(lambda x: np.zeros((x-1, 1)) + np.array([1.0])).reset_index()

Can somebody point out where I am going wrong?
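
A likely reason the attempt fails, sketched on plain NumPy (the value `n = 3` is a hypothetical stand-in for one group's Periods): inside `apply` the lambda receives the whole `Periods` Series of a group, so `x-1` is a Series rather than the integer a shape tuple expects. Even with an integer, `+` broadcasts the 1.0 over every element instead of appending it.

```python
import numpy as np

n = 3  # hypothetical: the Periods value of one maturity group

# What the attempt would compute for an integer n: broadcasting adds 1.0
# to every zero, giving an (n-1, 1) array of ones.
added = np.zeros((n - 1, 1)) + np.array([1.0])

# What the expected output needs: n-1 zeros followed by a single one.
appended = np.append(np.zeros(n - 1), 1.0)

print(added.ravel())  # every element is one
print(appended)       # zeros, then a trailing one
```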

UPDATED EDIT:

As a follow-up to the question above, how could the following be done with pandas operations?

Starting from the dataframe above, I want to add another column, so that the expected output looks like this:

Maturity,Periods,CP,TimeCF  
0.5,2,0,0.5
0.5,2,1,0.5

1.0,3,0,0.5
1.0,3,0,1.0
1.0,3,1,1.0

1.5,4,0,0.5
1.5,4,0,1.0
1.5,4,0,1.5
1.5,4,1,1.5

The new TimeCF column holds the time of each cash flow, assuming the bond pays semi-annual cash flows.
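
One way the follow-up might be approached, as a sketch rather than a definitive answer: number the rows within each maturity group, multiply by the semi-annual step, and cap the result at the maturity. The 0.5-year step and the cap are assumptions read off the expected output above; the frame is rebuilt here only so the snippet runs on its own.

```python
import numpy as np
import pandas as pd

# Rebuild the example frame: each maturity is repeated Periods times.
maturities = [0.5, 1.0, 1.5]
periods = [2, 3, 4]
df = pd.DataFrame({
    "Maturity": np.repeat(maturities, periods),
    "Periods": np.repeat(periods, periods),
})

# Position of each row within its maturity group: 0, 1, 2, ...
pos = df.groupby("Maturity").cumcount()

# Assumed semi-annual step of 0.5 years, capped at the bond's maturity.
df["TimeCF"] = np.minimum((pos + 1) * 0.5, df["Maturity"])

print(df)
```

The same `pos` Series could also feed the CP column, since the last row of a group is the one where `pos` equals `Periods - 1`.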

1 Answer

  1. Glen

    2019-11-14

    Doesn't seem like you need a groupby here... try this:

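    # Start with 0 everywhere.
    df['CP'] = 0
    # Flag rows whose Maturity differs from the next row's (the last row of each run).
    df.loc[df['Maturity'].ne(df['Maturity'].shift(-1)), 'CP'] = 1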
    
    print(df)
       Maturity  Periods  CP
    0       0.5        2   0
    1       0.5        2   1
    2       1.0        3   0
    3       1.0        3   0
    4       1.0        3   1
    

    If you do need groupby, you can use it in a similar way:

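    # Start with 0 everywhere.
    df['CP'] = 0
    # Take the last index label of each Maturity group and set CP to 1 there.
    df.loc[df.groupby('Maturity').apply(lambda x: x.index[-1]), 'CP'] = 1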
    
    print(df)
       Maturity  Periods  CP
    0       0.5        2   0
    1       0.5        2   1
    2       1.0        3   0
    3       1.0        3   0
    4       1.0        3   1
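
As a possible alternative (a sketch, not part of the answer above), `cumcount(ascending=False)` numbers each group's rows from the end, so the last row of every group gets 0. This avoids `apply` and does not depend on equal maturities sitting in adjacent rows.

```python
import pandas as pd

# The example frame from the question.
df = pd.DataFrame({
    "Maturity": [0.5, 0.5, 1.0, 1.0, 1.0],
    "Periods": [2, 2, 3, 3, 3],
})

# Count rows from the end of each group; the last row of a group gets 0.
from_end = df.groupby("Maturity").cumcount(ascending=False)

# CP is 1 exactly where that reverse count is 0, and 0 elsewhere.
df["CP"] = from_end.eq(0).astype(int)

print(df)
```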
    
