I have a DataFrame, TrainLabelModified, with shape 120538 x 3, as shown below:
        user_id  video_id  operation_times
0             0        10                3
1             0        15                3
2             0        19                7
3             0        21                3
4             0        28                5
5             0        30                9
6             0        39                3
7             0        40                3
8             0        45                3
9             0        47                2
10            0        58                3
...         ...       ...              ...
120526     5048         7                1
120527     5048        37                2
120528     5048        40               12
120529     5048        49                2
120530     5048        52                6
120531     5049         3               49
120532     5049        25               14
120533     5049        35               21
120534     5049        36                1
120535     5049        37                4
120536     5049        46               25
120537     5049        53               10
120538     5049        56                5
I want to build a new DataFrame, TrainDataFinal, with shape 5050 x 64 (one row per user; a user_id column plus one column per video), like this:
user_id v0_ot v1_ot v2_ot ... v61_ot v62_ot
0 0 ... ... ... ... ...
1 1 ... ... ... ... ...
2 2 ... ... ... ... ...
3 3 ... ... ... ... ...
4 4 ... ... ... ... ...
5 5 ... ... ... ... ...
... ... ... ... ... ... ...
5048 5048 ... ... ... ... ...
5049 5049 ... ... ... ... ...
For example, for user 0 in the sample data, the v(n)_ot values are v10_ot = 3, v15_ot = 3, v19_ot = 7, ..., v58_ot = 3, and every other v(n)_ot is 0.
My idea is to create TrainDataFinal = np.zeros([5050, 64]) and fill it in one value at a time from TrainLabelModified. But since the DataFrame is quite large, that might take too long. Is there a faster way to do this?
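To make the idea concrete, here is a minimal sketch of the zero-matrix approach with the element-by-element loop replaced by a single NumPy indexed assignment. It runs on a small made-up stand-in for TrainLabelModified. The sizes n_users and n_videos are illustrative (for the real data they would presumably be 5050 and 63), and the sketch assumes that user_id and video_id are zero-based integers and that each (user_id, video_id) pair appears at most once.

```python
import numpy as np
import pandas as pd

# Small made-up stand-in for TrainLabelModified (same three columns).
TrainLabelModified = pd.DataFrame({
    "user_id": [0, 0, 0, 2, 2],
    "video_id": [10, 15, 19, 3, 10],
    "operation_times": [3, 3, 7, 49, 14],
})

# Illustrative sizes; assumed to be 5050 users and 63 videos for the real data.
n_users, n_videos = 3, 20

# Start from all zeros, then write every (user, video) count in one step.
# Assumes each (user_id, video_id) pair occurs at most once; with repeated
# pairs, only one of the repeated values would be kept.
dense = np.zeros((n_users, n_videos), dtype=np.int64)
rows = TrainLabelModified["user_id"].to_numpy()
cols = TrainLabelModified["video_id"].to_numpy()
dense[rows, cols] = TrainLabelModified["operation_times"].to_numpy()

# Wrap the matrix in a DataFrame with v(n)_ot column names plus user_id.
TrainDataFinal = pd.DataFrame(
    dense, columns=[f"v{n}_ot" for n in range(n_videos)]
)
TrainDataFinal.insert(0, "user_id", np.arange(n_users))

print(TrainDataFinal.loc[0, ["v10_ot", "v15_ot", "v19_ot"]].tolist())
```

Users or videos that never appear in the long table simply keep their zeros, which matches the "other v(n)_ot = 0" requirement. pandas also has DataFrame.pivot for this kind of long-to-wide reshape, but it leaves missing combinations as NaN, so it would need an extra reindex and fillna step to get the same result.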