I have the following dataframes, called "key" and "yr_df":

key
Out[63]: 
         No_Households
NUTS_ID               
DEF01         0.001191
DEF02         0.003404
DEF03         0.002903
DEF04         0.001009
DEF05         0.001641
               ...
DEG0I         0.001468
DEG0J         0.001042
DEG0K         0.001062
DEG0L         0.001358
DEG0M         0.001289

[402 rows x 1 columns]

yr_df
Out[62]: 
       2010      2011      2012  ...      2017      2018      2019
0  40301000  39509000  39707000  ...  41304000  41378000  41506000

[1 rows x 10 columns]

I want to multiply each element of "key" by each of the values in "yr_df". In other words, multiply DEF01, DEF02, DEF03, etc. by the 2010 value (40301000), then by the 2011 value (39509000), and so on. The results would be stored in a new dataframe with the following structure:

          2010  2011  2012...
NUTS_ID
DEF01
DEF02
DEF03
...

What would be the best way to do this? Thanks in advance for your help.

1
JavierSando Nov 30, 2020 at 13:29

2 answers

Best answer

Use numpy broadcasting by converting both Series to numpy arrays:

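import pandas as pd

# [:, None] turns the 402 values into a (402, 1) column, which broadcasts against the 10 yearly values
a = key['No_Households'].to_numpy()[:, None] * yr_df.iloc[0].to_numpy()

df = pd.DataFrame(a, index=key.index, columns=yr_df.columns)
print(df)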
             2010        2011        2012        2017        2018        2019
DEF01   47998.491   47055.219   47291.037   49193.064   49281.198   49433.646
DEF02  137184.604  134488.636  135162.628  140598.816  140850.712  141286.424
DEF03  116993.803  114694.627  115269.421  119905.512  120120.334  120491.918
DEF04   40663.709   39864.581   40064.363   41675.736   41750.402   41879.554
DEF05   66133.941   64834.269   65159.187   67779.864   67901.298   68111.346
DEG0I   59161.868   57999.212   58289.876   60634.272   60742.904   60930.808
DEG0J   41993.642   41168.378   41374.694   43038.768   43115.876   43249.252
DEG0K   42799.662   41958.558   42168.834   43864.848   43943.436   44079.372
DEG0L   54728.758   53653.222   53922.106   56090.832   56191.324   56365.148
DEG0M   51947.989   50927.101   51182.323   53240.856   53336.242   53501.234
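
For anyone who wants to try the broadcasting step without the full data, here is a minimal self-contained sketch. It rebuilds small stand-ins for "key" and "yr_df" from the first three rows and first three years shown in the question; using strings as the column labels is an assumption, since the question does not show the column dtype.

```python
import pandas as pd

# Small stand-ins for the question's frames (values copied from the rows shown in the question).
# Assumption: the year column labels are strings; the real frame may use integers.
key = pd.DataFrame(
    {"No_Households": [0.001191, 0.003404, 0.002903]},
    index=pd.Index(["DEF01", "DEF02", "DEF03"], name="NUTS_ID"),
)
yr_df = pd.DataFrame({"2010": [40301000], "2011": [39509000], "2012": [39707000]})

col = key["No_Households"].to_numpy()[:, None]  # shape (3, 1): one row per region
row = yr_df.iloc[0].to_numpy()                  # shape (3,): one value per year

# (3, 1) * (3,) broadcasts to (3, 3): every region multiplied by every year
result = pd.DataFrame(col * row, index=key.index, columns=yr_df.columns)
print(result)
```

The key point is the shape of the two operands: a column of regions against a flat row of years, so numpy fills in every region/year combination.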
1
jezrael Nov 30, 2020 at 10:38

Take the dot product (matrix product) of the two matrices:

import pandas
import random

# Create some test data
df_1 = pandas.DataFrame([{"NUTS_ID": f"DEF{i}", "No_households": random.random()} for i in range(400)]).set_index("NUTS_ID")
df_2 = pandas.DataFrame([{"year": year, "value": random.randint(39707000, 41506000)} for year in range(2010, 2020)]).set_index("year").T

# Convert the dataframes to numpy arrays
df_1_numpy = df_1.to_numpy()
df_2_numpy = df_2.to_numpy()

# Take the product of the 2 matrices
product = df_1_numpy.dot(df_2_numpy)

# Recreate the dataframe
df = pandas.DataFrame(product, columns=df_2.columns, index=df_1.index)
print(df)

Output:

year             2010          2011          2012          2013          2014  \
NUTS_ID                                                                         
DEF0     1.842382e+07  1.790878e+07  1.776382e+07  1.820333e+07  1.836708e+07   
DEF1     3.463805e+07  3.366974e+07  3.339721e+07  3.422352e+07  3.453139e+07   
DEF2     2.448049e+07  2.379614e+07  2.360353e+07  2.418752e+07  2.440511e+07   
DEF3     1.909173e+06  1.855802e+06  1.840781e+06  1.886325e+06  1.903294e+06   
DEF4     2.403505e+07  2.336315e+07  2.317404e+07  2.374741e+07  2.396104e+07   
...               ...           ...           ...           ...           ...   
DEF395   8.322933e+06  8.090264e+06  8.024780e+06  8.223327e+06  8.297304e+06   
DEF396   7.356079e+06  7.150439e+06  7.092562e+06  7.268044e+06  7.333427e+06   
DEF397   3.778848e+07  3.673210e+07  3.643478e+07  3.733624e+07  3.767212e+07   
DEF398   6.353758e+06  6.176138e+06  6.126147e+06  6.277719e+06  6.334193e+06   
DEF399   3.601888e+07  3.501197e+07  3.472857e+07  3.558782e+07  3.590797e+07   

year             2015          2016          2017          2018          2019  
NUTS_ID                                                                        
DEF0     1.774076e+07  1.852360e+07  1.835540e+07  1.822035e+07  1.831470e+07  
DEF1     3.335387e+07  3.482565e+07  3.450942e+07  3.425552e+07  3.443291e+07  
DEF2     2.357289e+07  2.461308e+07  2.438958e+07  2.421014e+07  2.433551e+07  
DEF3     1.838392e+06  1.919513e+06  1.902083e+06  1.888089e+06  1.897866e+06  
DEF4     2.314396e+07  2.416522e+07  2.394579e+07  2.376961e+07  2.389270e+07  
...               ...           ...           ...           ...           ...  
DEF395   8.014365e+06  8.368009e+06  8.292024e+06  8.231016e+06  8.273640e+06  
DEF396   7.083356e+06  7.395919e+06  7.328761e+06  7.274840e+06  7.312512e+06  
DEF397   3.638749e+07  3.799314e+07  3.764815e+07  3.737115e+07  3.756468e+07  
DEF398   6.118196e+06  6.388169e+06  6.330162e+06  6.283589e+06  6.316128e+06  
DEF399   3.468350e+07  3.621395e+07  3.588512e+07  3.562109e+07  3.580556e+07  

[400 rows x 10 columns]
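
As a side note, multiplying a (n, 1) matrix by a (1, m) matrix is an outer product, so this approach and the broadcasting approach should give the same numbers. The sketch below checks that on three values taken from the question; the variable names are only illustrative.

```python
import numpy as np

# Illustrative values copied from the question (first three regions, first three years)
weights = np.array([0.001191, 0.003404, 0.002903])  # shape (3,)
totals = np.array([40301000, 39509000, 39707000])   # shape (3,)

via_dot = weights[:, None].dot(totals[None, :])  # (3, 1) dot (1, 3) -> (3, 3)
via_outer = np.outer(weights, totals)            # outer product of two 1-D arrays
via_broadcast = weights[:, None] * totals        # broadcasting, as in the other answer

# All three routes compute weights[i] * totals[j] for every pair (i, j)
print(np.allclose(via_dot, via_outer), np.allclose(via_dot, via_broadcast))
```

Which form to use is mostly a matter of taste; np.outer states the intent most directly when both inputs are one-dimensional.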
2
Gijs Wobben Nov 30, 2020 at 10:44