WhatsApp has an option to email a group conversation to yourself. I did that and now want to explore it in R. The problem is that the file seems to use multiple separators, which I don't know how to handle in R.
Here is what I tried:
library(readr)
library(dplyr)
> gf <- read_delim('df.txt', col_names = F, skip = 2, delim='\t')
Warning message:
15 problems parsing 'df.txt'. See problems(...) for more details.
> head(gf)
Source: local data frame [6 x 12]
X1 X2 X3 X4 X5 X6 X7 X8 X9 X10 X11 X12
1 9:14pm Mar 31 umair: Great NA NA NA NA NA NA NA
2 9:14pm Mar 31 umair: I am back NA NA NA NA NA NA NA
3 9:15pm Mar 31 umair: ?? NA NA NA NA NA NA NA
4 10:27pm Mar 31 umair: Kon kon zinda hay NA NA NA NA NA NA NA
5 10:49pm Mar 31 Kazim: Sab zinda hain ..... NA NA NA NA NA NA NA
6 10:50pm Mar 31 umair: Very good NA NA NA NA NA NA NA
Can you help me read this file so that the "sender:message" is separated into 2 columns? And the first 2 columns are read as separate columns as shown. Obviously I don't want columns X4 to X12.
Here are the first few lines of the raw file:
9:14pm, Mar 31 - umair: Great
9:14pm, Mar 31 - umair: I am back
9:15pm, Mar 31 - umair: 👹
10:27pm, Mar 31 - umair: Kon kon zinda hay
10:49pm, Mar 31 - Kazim: Sab zinda hain .....
10:50pm, Mar 31 - umair: Very good
10:52pm, Mar 31 - umair: Abid agaya dobara?
10:54pm, Mar 31 - Kazim: Nai wo nai aya
10:54pm, Mar 31 - umair: Hmmmmmmmmm
This question is old, but when I wanted to do the same thing, a Google search led me here. I figured it out and put the solution into an R package. Install it and read in the data:
devtools::install_github("JBGruber/rwhatsapp")
library(rwhatsapp)
gf <- rwa_read("df.txt")
Or you can paste the lines in directly:
> lines <- c(
"9:14pm, Mar 31 - umair: Great",
"9:14pm, Mar 31 - umair: I am back",
"9:15pm, Mar 31 - umair: ",
"10:27pm, Mar 31 - umair: Kon kon zinda hay",
"10:49pm, Mar 31 - Kazim: Sab zinda hain .....",
"10:50pm, Mar 31 - umair: Very good",
"10:52pm, Mar 31 - umair: Abid agaya dobara?",
"10:54pm, Mar 31 - Kazim: Nai wo nai aya",
"10:54pm, Mar 31 - umair: Hmmmmmmmmm"
)
> rwa_read(lines)
# A tibble: 9 x 3
time author text
<dttm> <fct> <chr>
1 2018-03-31 21:14:13 umair Great
2 2018-03-31 21:14:13 umair I am back
3 2018-03-31 21:15:13 umair " "
4 2018-03-31 22:27:13 umair Kon kon zinda hay
5 2018-03-31 22:49:13 Kazim Sab zinda hain .....
6 2018-03-31 22:50:13 umair Very good
7 2018-03-31 22:52:13 umair Abid agaya dobara?
8 2018-03-31 22:54:13 Kazim Nai wo nai aya
9 2018-03-31 22:54:13 umair Hmmmmmmmmm
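If installing a package is not an option, the same split can be sketched in base R with a regular expression. This is only a rough sketch, not how rwhatsapp works internally: it assumes every message follows the "time, date - sender: message" layout shown in the question, and the names pattern, parts and matched are just illustrative. Lines that do not fit the layout (for example the continuation of a multi-line message) are dropped here rather than merged into the previous message.

```r
# Sample lines copied from the question; for the real export,
# use lines <- readLines("df.txt") instead.
lines <- c(
  "9:14pm, Mar 31 - umair: Great",
  "10:49pm, Mar 31 - Kazim: Sab zinda hain .....",
  "10:52pm, Mar 31 - umair: Abid agaya dobara?"
)

# Assumed layout: "<time>, <date> - <sender>: <message>"
pattern <- "^([^,]+), ([^-]+) - ([^:]+): (.*)$"

# regexec() + regmatches() return, per line, the full match
# followed by the four captured groups (or character(0) if no match).
parts <- regmatches(lines, regexec(pattern, lines))

# Keep only the lines that matched the assumed layout.
matched <- lengths(parts) == 5

# Bind the matches into a data frame and drop the full-match column.
gf <- as.data.frame(do.call(rbind, parts[matched]),
                    stringsAsFactors = FALSE)[, -1]
names(gf) <- c("time", "date", "sender", "message")

print(gf)
```

The time and date columns stay as plain character strings in this sketch; the export contains no year, so converting them to a proper date-time would need an assumption about which year the messages belong to.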