Problem description
I have a list like this:
import pandas as pd
body = ['"name","number"', '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"', '"Deepanshu Paymate","+91 72082 94015"']
df = pd.DataFrame(body)
print(df)
The output is:
0
0 "name","number"
1 "Dudh & Pani Dudh & Pani Dudh Wala","+91 70148...
2 "Deepanshu Paymate","+91 72082 94015"
But I want two columns: the first should be named name and the second should be named number.
Answer 1
Use a list comprehension with split: split every element except the first to get the rows, and split the first element to get the column names:
df = pd.DataFrame([x.split(',') for x in body[1:]], columns=body[0].split(','))
print(df)
"name" "number"
0 "Dudh & Pani Dudh & Pani Dudh Wala" "+91 70148 05126"
1 "Deepanshu Paymate" "+91 72082 94015"
If you also want to remove the double quotes, add strip:
df = pd.DataFrame([[y.strip('"') for y in x.split(',')] for x in body[1:]],
columns=[y.strip('"') for y in body[0].split(',')])
print(df)
name number
0 Dudh & Pani Dudh & Pani Dudh Wala +91 70148 05126
1 Deepanshu Paymate +91 72082 94015
Another idea, using replace to drop the quote characters before splitting:
body = [x.replace("'","").replace('"','') for x in body]
df = pd.DataFrame([x.split(',') for x in body[1:]], columns=body[0].split(','))
print(df)
name number
0 Dudh & Pani Dudh & Pani Dudh Wala +91 70148 05126
1 Deepanshu Paymate +91 72082 94015
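One caveat with the split(',') variants above: they would also split on a comma that appears inside a quoted field. A minimal sketch of a different technique, assuming the strings are well-formed CSV lines, is to let the standard library's csv.reader do the parsing; it removes the surrounding double quotes and respects commas inside them. The name rows is introduced here only for illustration.

```python
import csv

import pandas as pd

body = ['"name","number"',
        '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"',
        '"Deepanshu Paymate","+91 72082 94015"']

# csv.reader accepts any iterable of strings; each string is parsed as one CSV line
rows = list(csv.reader(body))

# First parsed row supplies the column names, the remaining rows are the data
df = pd.DataFrame(rows[1:], columns=rows[0])
print(df)
```

This is not the answerer's method, just an alternative that avoids hand-written quote handling.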
Answer 2
From your list of strings, we build (via a list comprehension) a list of [name, number] lists, and then create your DataFrame from that new list:
In[16]: new_body = [element.split(",") for element in body]
In[17]: df = pd.DataFrame(new_body)
In[18]: df
Out[18]:
0 1
0 "name" "number"
1 "Dudh & Pani Dudh & Pani Dudh Wala" "+91 70148 05126"
2 "Deepanshu Paymate" "+91 72082 94015"
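The result above still keeps the header as row 0 and the double quotes around every value. A possible follow-up, sketched here as an assumption about what the asker wants rather than as part of this answer, strips the quotes column by column and then promotes the first row to column names:

```python
import pandas as pd

body = ['"name","number"',
        '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"',
        '"Deepanshu Paymate","+91 72082 94015"']

new_body = [element.split(",") for element in body]
df = pd.DataFrame(new_body)

# Strip the surrounding double quotes from every cell, one column at a time
df = df.apply(lambda col: col.str.strip('"'))

# Use the first row as column names, then drop it and renumber the index from 0
df.columns = df.iloc[0].tolist()
df = df.iloc[1:].reset_index(drop=True)
print(df)
```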
Answer 3
I found this way:
body = ['"name","number"', '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"', '"Deepanshu Paymate","+91 72082 94015"']
data = pd.DataFrame(body)
# new data frame with split value columns
data = data[0].str.split(",", n = 1, expand = True)
new_header = data.iloc[0] #grab the first row for the header
data = data[1:] #take the data less the header row
data.columns = new_header #set the header row as the df header
print(data)
The output is:
0 "name" "number"
1 "Dudh & Pani Dudh & Pani Dudh Wala" "+91 70148 05126"
2 "Deepanshu Paymate" "+91 72082 94015"
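Since each string is already a CSV line, the split-and-promote-header steps above could arguably be delegated to pandas' own parser. A minimal sketch, assuming the lines are well-formed CSV: join them with newlines and pass the text to pd.read_csv through io.StringIO. The dtype=str argument is a choice made here so that the phone numbers stay as text.

```python
import io

import pandas as pd

body = ['"name","number"',
        '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"',
        '"Deepanshu Paymate","+91 72082 94015"']

# Join the lines into one CSV document and let read_csv parse quotes and header
csv_text = "\n".join(body)
df = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(df)
```

Unlike the manual split, this also removes the double quotes and starts the index at 0.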
Answer 4
Instead of
body = ['"name","number"', '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"', '"Deepanshu Paymate","+91 72082 94015"']
use
body = [["name","number"], ["Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"], ["Deepanshu Paymate","+91 72082 94015"]]
That is, instead of surrounding each pair with apostrophes ('...'), surround it with square brackets ([...]).
Then the DataFrame created with df = pd.DataFrame(body) will be
0 1
0 name number
1 Dudh & Pani Dudh & Pani Dudh Wala +91 70148 05126
2 Deepanshu Paymate +91 72082 94015
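With the nested-list form, the header still ends up as the first data row. A short sketch of one way to get named columns, assuming the first inner list always holds the column names:

```python
import pandas as pd

body = [["name", "number"],
        ["Dudh & Pani Dudh & Pani Dudh Wala", "+91 70148 05126"],
        ["Deepanshu Paymate", "+91 72082 94015"]]

# The first inner list becomes the column names, the rest become the rows
df = pd.DataFrame(body[1:], columns=body[0])
print(df)
```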