SettingWithCopyWarning when inserting data into a DataFrame: A value is trying to be set on a copy of a slice from a


SettingWithCopyWarning: a solution

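Scenario

Problem: after reading a CSV file, I needed to add a new feature column and set its values based on an existing feature. Modifying the new column triggered SettingWithCopyWarning, and it took me a long time to resolve.

Example: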

import pandas as pd
import numpy as np

aa = np.array([1, 0, 1, 0])
bb = pd.DataFrame(aa.T, columns=['one'])
print(bb)
   one
0    1
1    0
2    1
3    0
bb['two'] = 0
print(bb)
   one  two
0    1    0
1    0    0
2    1    0
3    0    0

Modifying the new column by condition and printing again raises the warning:

for i in range(bb.shape[0]):
    if bb['one'][i] == 0:
        bb['two'][i] = 1
print(bb)

C:/PycharmProjects/NaiveBayesProduct/pandas/try_index.py:22: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  bb['two'][i] = 1
   one  two
0    1    0
1    0    1
2    1    0
3    0    1
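
The warning comes from chained indexing: bb['two'][i] = 1 first evaluates bb['two'] and then assigns into that intermediate result, and pandas cannot guarantee that the intermediate object is a view of bb rather than a copy. A minimal sketch of one common alternative, which is not part of the original post: address the row label and the column name together in a single .loc call. Variable names follow the example above.

```python
import numpy as np
import pandas as pd

bb = pd.DataFrame(np.array([1, 0, 1, 0]), columns=['one'])
bb['two'] = 0

# One indexing call instead of two: row label and column name together.
for i in bb.index:
    if bb.loc[i, 'one'] == 0:
        bb.loc[i, 'two'] = 1

print(bb)
```

Because the assignment goes through one indexer on bb itself, there is no intermediate object for pandas to be unsure about.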

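Solution

The approach that worked for me is to build the complete array first and then insert it into the DataFrame, which avoids assigning through the chained index bb['two'][i]. Below, the example above is redone this way.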

import pandas as pd
import numpy as np

aa = np.array([1, 0, 1, 0])
bb = pd.DataFrame(aa.T, columns=['one'])
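# Create an ndarray to hold the values to insert
two = np.zeros(bb.shape[0])
# Fill in two according to the condition
for i in range(bb.shape[0]):
    if bb['one'][i] == 0:
        two[i] = 1
# Once it is complete, insert two into the DataFrame.
# insert takes three arguments: the column position, the column name, and the values.
# bb.insert(0, 'two', two) would make it the first column instead.
bb.insert(1, 'two', two)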
print(bb)

   one  two
0    1  0.0
1    0  1.0
2    1  0.0
3    0  1.0
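
The loop that fills the array can also be replaced by a single vectorized expression. The following is a sketch under the same setup, not the author's code: np.where builds the whole column from the condition, and the column is then inserted with the same insert call. Since the values passed to np.where are integers, the column comes out as integers rather than the floats that np.zeros produces.

```python
import numpy as np
import pandas as pd

bb = pd.DataFrame(np.array([1, 0, 1, 0]), columns=['one'])

# Build the whole column in one step: 1 where 'one' is 0, otherwise 0.
two = np.where(bb['one'] == 0, 1, 0)

# Same insert call as in the example above: position 1, column name, values.
bb.insert(1, 'two', two)
print(bb)
```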

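My own code

My own case: I was classifying comments with a naive Bayes classifier, with positive defined as 1 and negative as 0. The warning appeared when I wrote the classification results into the DataFrame.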

comm_data = pd.read_csv("C:\\Users\\lenovo\\Desktop\\comm\\new_data.csv", encoding="utf-8")
# comm_data=new_data
print(comm_data.head(5))
comm_data["classify"] = "#"
for c in range(len(comm_data)):
    classify = testingNB(comm_data["content"][c])
    # print(classify)
    comm_data["classify"][c] = classify
comm_data.to_csv("C:\\Users\\lenovo\\Desktop\\comm\\comm_data.csv")

This produced the warning:

D:/office3/python/python_py/compare/score_variance/get_data/web5_data_mg.py:161: SettingWithCopyWarning: 
A value is trying to be set on a copy of a slice from a DataFrame

See the caveats in the documentation: http://pandas.pydata.org/pandas-docs/stable/indexing.html#indexing-view-versus-copy
  comm_data["classify"][c]=classify

Solution:

comm_data = pd.read_csv("C:\\Users\\lenovo\\Desktop\\comm\\new_data.csv", encoding="utf-8")
# comm_data=new_data
print(comm_data.head(5))
# comm_data["classify"]="#"
# Collect the results in a separate array first
classify = np.zeros(comm_data.shape[0])
for c in range(len(comm_data)):
    classifynb = testingNB(comm_data["content"][c])
    # print(classify)
    # comm_data["classify"][c]=classify
    classify[c] = classifynb
# Insert the finished array as the first column
comm_data.insert(0, 'classify', classify)
comm_data.to_csv("C:\\Users\\lenovo\\Desktop\\comm\\comm_data.csv")

That resolved the problem.

