iT邦幫忙

0

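How do I convert a Python list into a DataFrame with one row and n columns?

As the title says: I'm a beginner who normally only uses R and I'm not familiar with Python. I've recently been teaching myself web scraping and ran into a basic problem that I couldn't solve even after a long search online. I'd appreciate any suggestions.
This is my list, obtained by the web scraper. I'd like to convert it into a one-row DataFrame so that I can combine it with other rows later. I have two main questions: how do I convert a list into a one-row DataFrame, and how do I then combine several such DataFrames, similar to rbind in R?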

row = ['Gr',
 'http://www.purriodictableofcats.com/images/b-grumpycat.jpg',
 'Female',
 'Real Name: Tardar Sauce',
 'Hit the internet: 2012',
 "Interesting Facts: Being both grumpy and adorableNamed as MSNBC's 2012 Most Influential CatWon Buzzfeed's Meme of the Year Award in 2013Published 2 books and a wall calendarAppeared on: Today Show, Good Morning America, CBS Evening News, American Idol and many more",
 'https://www.facebook.com/TheOfficialGrumpyCat',
 'https://twitter.com/RealGrumpyCat',
 'http://instagram.com/realgrumpycat/']

I've already tried several approaches, such as transpose. Below is my most recent attempt and the error message:

pd.DataFrame(row, columns = ["id","img","sex","real_name","year","fact","link1","link2","link3"])

ValueError: Shape of passed values is (9, 1), indices imply (9, 9)
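Since the conversion hasn't worked yet, I haven't written anything like rbind. I hope someone can point me to a function with equivalent functionality. Thanks.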

2
froce
iT邦 Master, Level 6 ‧ 2019-06-18 08:28:35
Best answer
0
japhenchen
iT邦 Novice, Level 4 ‧ 2019-06-18 08:49:38

Use a dict. Below is a rough example; adapt it to your own situation.

#!/usr/bin/env python3
import os

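mylist = [["00001", "John", "Paris", "12345"], ["00002", "May", "Taiwan", "66666"], ["00003", "Toyota", "Japan", "22222"]]
mydict = dict()

for ml in mylist:  # convert the list into a dict keyed by ID
    # "tel" belongs inside the per-ID record, not at the top level of mydict
    mydict[ml[0]] = {"name": ml[1], "city": ml[2], "tel": ml[3]}

print(mydict["00001"]["name"])  # print the name stored under ID 00001

# Best to check first that "00001" exists; otherwise a KeyError is raised
# The check is simple: if "00001" in mydict:
# Even with millions of entries, looking up one item on a reasonable CPU takes a fraction of a millisecond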

Merging dicts:

mydict.update(yourdict)
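
One thing to keep in mind with `update`: when both dicts contain the same key, the value from the dict being merged in replaces the existing one. A minimal sketch, where `yourdict` and its contents are made up for illustration:

```python
mydict = {
    "00001": {"name": "John", "city": "Paris"},
    "00002": {"name": "May", "city": "Taiwan"},
}

# Hypothetical second dict: "00002" overlaps, "00003" is new
yourdict = {
    "00002": {"name": "May", "city": "Tokyo"},
    "00003": {"name": "Toyota", "city": "Japan"},
}

# update() modifies mydict in place; overlapping keys take yourdict's value
mydict.update(yourdict)

print(sorted(mydict))
print(mydict["00002"]["city"])
```

If the existing record should win instead, one option is to skip keys that are already present before merging.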

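To store a dict compressed with gzip, I suggest pairing it with the jsonpickle package.

        # requires: import gzip, jsonpickle (jsonpickle is a third-party package)
        if mydict is not None:
            with gzip.open(self.dictgz, "wb") as gz:
                sp = jsonpickle.encode(mydict)  # serialize the dict to a JSON string
                bsp = sp.encode()               # str -> bytes for the gzip file
                gz.write(bsp)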

Loading it back from disk:

        if os.path.exists("/home/user001/mydict.gz"):
            with gzip.open("/home/user001/mydict.gz","rb") as gz :
                gzdata = gz.read()
                bsp=gzdata.decode()
                if len(bsp)>0:
                    mydict=jsonpickle.decode(bsp)
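
The two fragments above come from a larger class, so they do not run on their own. As a self-contained sketch of the same save/load round trip, the version below swaps in the standard-library `json` module for jsonpickle (an assumption made so it runs without third-party packages; it only handles JSON-compatible values). The helper names `save_dict` and `load_dict` and the temporary file path are hypothetical.

```python
import gzip
import json
import os
import tempfile


def save_dict(d, path):
    """Serialize d to JSON and write it gzip-compressed to path."""
    with gzip.open(path, "wb") as gz:
        gz.write(json.dumps(d).encode("utf-8"))


def load_dict(path):
    """Read a gzip-compressed JSON dict back; return None if the file is missing or empty."""
    if not os.path.exists(path):
        return None
    with gzip.open(path, "rb") as gz:
        text = gz.read().decode("utf-8")
    return json.loads(text) if text else None


mydict = {"00001": {"name": "John", "city": "Paris", "tel": "12345"}}

# Write to a throwaway directory rather than a fixed home-directory path
path = os.path.join(tempfile.mkdtemp(), "mydict.gz")
save_dict(mydict, path)
restored = load_dict(path)
print(restored == mydict)
```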
0
ccutmis
iT邦 Graduate Student, Level 4 ‧ 2019-06-18 08:55:29
import pandas as pd

AA = ['Gr',
 'http://www.purriodictableofcats.com/images/b-grumpycat.jpg',
 'Female',
 'Real Name: Tardar Sauce',
 'Hit the internet: 2012',
 "Interesting Facts: Being both grumpy and adorableNamed",
 'https://www.facebook.com/TheOfficialGrumpyCat',
 'https://twitter.com/RealGrumpyCat',
 'http://instagram.com/realgrumpycat/']

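# Each value is wrapped in a one-element list, so every column has exactly one row
BB = {
    'id': [AA[0]], 'img': [AA[1]], 'sex': [AA[2]],
    'real_name': [AA[3]], 'year': [AA[4]], 'fact': [AA[5]],
    'link1': [AA[6]], 'link2': [AA[7]], 'link3': [AA[8]],
}
data = pd.DataFrame(BB)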
print(data)

Hope this helps as a reference...
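
The question also asks for an equivalent of R's rbind. A minimal sketch, assuming each scraped record is a list of nine values in the column order from the question: passing a flat list makes pandas read it as nine rows of one column (hence the shape error in the question), while wrapping it in an outer list makes it one row of nine columns. `pd.concat` then stacks such DataFrames row-wise. `other_row` is a made-up placeholder record, not scraped data, and the `fact` text is shortened.

```python
import pandas as pd

columns = ["id", "img", "sex", "real_name", "year", "fact", "link1", "link2", "link3"]

row = [
    "Gr",
    "http://www.purriodictableofcats.com/images/b-grumpycat.jpg",
    "Female",
    "Real Name: Tardar Sauce",
    "Hit the internet: 2012",
    "Interesting Facts: Being both grumpy and adorable",
    "https://www.facebook.com/TheOfficialGrumpyCat",
    "https://twitter.com/RealGrumpyCat",
    "http://instagram.com/realgrumpycat/",
]

# Hypothetical second record, only here to demonstrate stacking
other_row = ["Xx"] + ["placeholder"] * 8

# [row] is a list containing one list, i.e. one row with nine columns
df1 = pd.DataFrame([row], columns=columns)
df2 = pd.DataFrame([other_row], columns=columns)

# Row-wise stacking, similar in spirit to rbind; ignore_index renumbers rows 0..n-1
combined = pd.concat([df1, df2], ignore_index=True)

# Alternative: collect all row lists first and build the DataFrame once
combined_once = pd.DataFrame([row, other_row], columns=columns)

print(combined.shape)
```

When scraping many pages in a loop, collecting the row lists and building the DataFrame once at the end is usually simpler than concatenating one-row DataFrames repeatedly.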
