Converting into a DataFrame

Popularity: 29   Published: 2023-07-16 10:44:19.0

I have a list like this:

import pandas as pd

body = ['"name","number"', '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"', '"Deepanshu Paymate","+91 72082 94015"']

df = pd.DataFrame(body)
print(df)

The output is:

                                                   0
0                                    "name","number"
1  "Dudh & Pani Dudh & Pani Dudh Wala","+91 70148...
2              "Deepanshu Paymate","+91 72082 94015"

But I want two columns: the first should be name and the second should be number.

Use a list comprehension with split: split every value of the list except the first to get the data, and split the first value of the list to get the columns:

df = pd.DataFrame([x.split(',') for x in body[1:]], columns=body[0].split(','))
print(df)
                                "name"           "number"
0  "Dudh & Pani Dudh & Pani Dudh Wala"  "+91 70148 05126"
1                  "Deepanshu Paymate"  "+91 72082 94015"

If you want to remove the double quotes (""):

df = pd.DataFrame([[y.strip('"') for y in x.split(',')] for x in body[1:]], 
                    columns=[y.strip('"') for y in body[0].split(',')])
print(df)
                                name           number
0  Dudh & Pani Dudh & Pani Dudh Wala  +91 70148 05126
1                  Deepanshu Paymate  +91 72082 94015
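
One caveat with a plain split(','): it would also split on a comma that sits inside a quoted field. The sample data has no such value, so this is only a precaution. A minimal sketch using the standard library's csv.reader, which accepts any iterable of strings and handles the quoting itself (the names rows, header and records are just illustrative):

```python
import csv

import pandas as pd

body = [
    '"name","number"',
    '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"',
    '"Deepanshu Paymate","+91 72082 94015"',
]

# csv.reader parses each string as one CSV record and removes the
# surrounding double quotes, so no manual strip('"') is needed.
rows = list(csv.reader(body))
header, records = rows[0], rows[1:]

df = pd.DataFrame(records, columns=header)
print(df)
```

The result should have the same two columns, name and number, as the strip('"') version above.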

Another idea is to use replace:

body = [x.replace("'","").replace('"','') for x in body]
df = pd.DataFrame([x.split(',') for x in body[1:]], columns=body[0].split(','))
print(df)

                                name           number
0  Dudh & Pani Dudh & Pani Dudh Wala  +91 70148 05126
1                  Deepanshu Paymate  +91 72082 94015
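
Note that replace removes every quote character, including an apostrophe that happens to be part of a name. If that matters, one possible alternative (a sketch, not part of the answer above) is to join the strings into a single CSV text and let pd.read_csv parse it. dtype=str is an assumption made here so that the phone numbers are kept as plain strings:

```python
import io

import pandas as pd

body = [
    '"name","number"',
    '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"',
    '"Deepanshu Paymate","+91 72082 94015"',
]

# Join the list into one CSV document; the first line becomes the header.
csv_text = "\n".join(body)

# dtype=str keeps every column as text instead of letting pandas infer types.
df = pd.read_csv(io.StringIO(csv_text), dtype=str)
print(df)
```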

From your list of strings we build (with a list comprehension) a list of [name, number] pairs, and then we create your DataFrame from this new list:

 In[16]: new_body = [element.split(",")  for element in body]
 In[17]: df = pd.DataFrame(new_body)

 In[18]: df
 Out[18]:
                                      0                  1
 0                               "name"           "number"
 1  "Dudh & Pani Dudh & Pani Dudh Wala"  "+91 70148 05126"
 2                  "Deepanshu Paymate"  "+91 72082 94015"

I found this way:

body = ['"name","number"', '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"', '"Deepanshu Paymate","+91 72082 94015"']

data = pd.DataFrame(body)

# new data frame with the values split into two columns
data = data[0].str.split(",", n=1, expand=True)

new_header = data.iloc[0]  # grab the first row for the header
data = data[1:]  # take the data less the header row
data.columns = new_header  # set the header row as the df header

print(data)

The output is:

0                               "name"           "number"
1  "Dudh & Pani Dudh & Pani Dudh Wala"  "+91 70148 05126"
2                  "Deepanshu Paymate"  "+91 72082 94015"
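
In the output above the double quotes are still part of every value, the leading 0 in the header line is the name that the columns inherit from the first row, and the index starts at 1. A possible cleanup, sketched with the same str.split call (the variable names parts and df are only illustrative):

```python
import pandas as pd

body = [
    '"name","number"',
    '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"',
    '"Deepanshu Paymate","+91 72082 94015"',
]

# Split each string once on the first comma into two columns.
parts = pd.Series(body).str.split(",", n=1, expand=True)

# Strip the surrounding double quotes from every column.
parts = parts.apply(lambda col: col.str.strip('"'))

# Drop the header row from the data and renumber the index from 0.
df = parts.iloc[1:].reset_index(drop=True)

# Assigning a plain list avoids carrying over the row label 0 as the columns' name.
df.columns = parts.iloc[0].tolist()
print(df)
```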

Instead of

body = ['"name","number"', '"Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"', '"Deepanshu Paymate","+91 72082 94015"']

use

body = [["name","number"], ["Dudh & Pani Dudh & Pani Dudh Wala","+91 70148 05126"], ["Deepanshu Paymate","+91 72082 94015"]]

That is, instead of surrounding each pair with apostrophes ('...'), surround it with square brackets ([...]).

Then the DataFrame created with df = pd.DataFrame(body) will be:

                                   0                1
0                               name           number
1  Dudh & Pani Dudh & Pani Dudh Wala  +91 70148 05126
2                  Deepanshu Paymate  +91 72082 94015
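
With this list-of-lists form, name and number still end up as the first data row rather than as column names. Assuming the first inner list is always the header, a small sketch that uses it for the columns:

```python
import pandas as pd

body = [
    ["name", "number"],
    ["Dudh & Pani Dudh & Pani Dudh Wala", "+91 70148 05126"],
    ["Deepanshu Paymate", "+91 72082 94015"],
]

# The first inner list supplies the column names, the rest are the rows.
df = pd.DataFrame(body[1:], columns=body[0])
print(df)
```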