Using `strsplit` and `grep`. First I made an object `para` containing your paragraph.
toMatch <- c("Martin Luther", "Paul", "Melanchthon")
unlist(strsplit(para,split="\\."))[grep(paste(toMatch, collapse="|"),unlist(strsplit(para,split="\\.")))]
> unlist(strsplit(para,split="\\."))[grep(paste(toMatch, collapse="|"),unlist(strsplit(para,split="\\.")))]
[1] "Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle Johann Reuchlin"
[2] " Melanchthon became professor of the Greek language in Wittenberg at the age of 21"
[3] " He studied the Scripture, especially of Paul, and Evangelical doctrine"
[4] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
Or a bit cleaner:
sentences<-unlist(strsplit(para,split="\\."))
sentences[grep(paste(toMatch, collapse="|"),sentences)]
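Since `para` itself is not shown here, the following is a minimal reproducible sketch of the same approach. The three-sentence `para` is a stand-in (two sentences taken from the output above plus one made-up filler sentence), and `matched` is a name chosen only for this illustration. It also trims the leading space that splitting on "." leaves behind. Note that splitting on every period will also break on abbreviations such as "Dr.", so treat this as a rough sentence splitter.

```r
# Stand-in paragraph (assumption: the real `para` is longer).
para <- paste(
  "Melanchthon became professor of the Greek language in Wittenberg at the age of 21.",
  "He studied the Scripture, especially of Paul, and Evangelical doctrine.",
  "This filler sentence names nobody on the list."
)

toMatch <- c("Martin Luther", "Paul", "Melanchthon")

# Split on literal periods, then drop the leading/trailing whitespace.
sentences <- trimws(unlist(strsplit(para, split = ".", fixed = TRUE)))

# Keep every sentence that mentions at least one of the names.
matched <- sentences[grepl(paste(toMatch, collapse = "|"), sentences)]
print(matched)
```

Using `grepl` (a logical vector) instead of `grep` (indices) gives the same subset here; it is only a stylistic choice.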
If you want the sentences each person appears in, returned separately for each person:
toMatch <- c("Martin Luther", "Paul", "Melanchthon")
sentences<-unlist(strsplit(para,split="\\."))
foo<-function(Match){sentences[grep(Match,sentences)]}
lapply(toMatch,foo)
[[1]]
[1] "Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle Johann Reuchlin"
[[2]]
[1] " He studied the Scripture, especially of Paul, and Evangelical doctrine"
[[3]]
[1] " Melanchthon became professor of the Greek language in Wittenberg at the age of 21"
[2] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
EDIT 3: To add the name of each person, do something simple such as:
foo<-function(Match){c(Match,sentences[grep(Match,sentences)])}
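As an alternative to putting the name in the first element, the list itself can be named, which keeps the names separate from the sentences. This is only a sketch: `byPerson` is a hypothetical name, the three-element `sentences` vector is a shortened stand-in for the real one, and `fixed = TRUE` assumes the names are meant as literal strings rather than regular expressions.

```r
# Shortened stand-in for the `sentences` vector built earlier.
sentences <- c(
  "Melanchthon became professor of the Greek language in Wittenberg at the age of 21",
  "He studied the Scripture, especially of Paul, and Evangelical doctrine",
  "Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
)
toMatch <- c("Martin Luther", "Paul", "Melanchthon")

# One list element per name, labelled with that name.
byPerson <- setNames(
  lapply(toMatch, function(m) sentences[grepl(m, sentences, fixed = TRUE)]),
  toMatch
)

print(byPerson[["Melanchthon"]])
```

A name with no matching sentence (here "Martin Luther", because the stand-in data omits that sentence) simply yields an empty character vector.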
EDIT 4:
And if you want to find sentences that mention several people/places/things (words) together, just add a pattern for that combination, such as:
toMatch <- c("Martin Luther", "Paul", "Melanchthon","(?=.*Melanchthon)(?=.*Scripture)")
and set `perl` to `TRUE`:
foo<-function(Match){c(Match,sentences[grep(Match,sentences,perl = TRUE)])}
> lapply(toMatch,foo)
[[1]]
[1] "Martin Luther"
[2] "Opposed as a reformer at Tübingen, he accepted a call to the University of Wittenberg by Martin Luther, recommended by his great-uncle Johann Reuchlin"
[[2]]
[1] "Paul"
[2] " He studied the Scripture, especially of Paul, and Evangelical doctrine"
[[3]]
[1] "Melanchthon"
[2] " Melanchthon became professor of the Greek language in Wittenberg at the age of 21"
[3] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
[[4]]
[1] "(?=.*Melanchthon)(?=.*Scripture)"
[2] " Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
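The `(?=.*A)(?=.*B)` pattern works because each lookahead checks, without consuming any text, that its word occurs somewhere later in the sentence, so both words must be present in any order. A small sketch of building such a pattern from a vector of terms follows; `allOf` is a hypothetical helper, the `sentences` vector is a shortened stand-in, and the terms are assumed to contain no regex metacharacters (they would need escaping otherwise).

```r
# Shortened stand-in for the `sentences` vector built earlier.
sentences <- c(
  "Melanchthon became professor of the Greek language in Wittenberg at the age of 21",
  "He studied the Scripture, especially of Paul, and Evangelical doctrine",
  "Johann Eck having attacked his views, Melanchthon replied based on the authority of Scripture in his Defensio contra Johannem Eckium"
)

# Hypothetical helper: one lookahead per term, concatenated.
allOf <- function(terms) paste0("(?=.*", terms, ")", collapse = "")

terms <- c("Melanchthon", "Scripture")
pattern <- allOf(terms)

# Lookaheads need the PCRE engine, hence perl = TRUE.
hits <- grepl(pattern, sentences, perl = TRUE)

# Regex-free cross-check: AND together one literal search per term.
hits2 <- Reduce(`&`, lapply(terms, grepl, x = sentences, fixed = TRUE))

print(sentences[hits])
```

The `Reduce` variant avoids lookaheads entirely and may be easier to read when the terms are plain strings.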
EDIT 5: answering your other question:
Given:
sentenceR<-"Opposed as a reformer at [[Tübingen]], he accepted a call to the University of [[Wittenberg]] by [[Martin Luther]], recommended by his great-uncle [[Johann Reuchlin]]"
gsub("\\[\\[|\\]\\]", "", regmatches(sentenceR, gregexpr("\\[\\[.*?\\]\\]", sentenceR))[[1]])
This will give you the words inside the double brackets.
> gsub("\\[\\[|\\]\\]", "", regmatches(sentenceR, gregexpr("\\[\\[.*?\\]\\]", sentenceR))[[1]])
[1] "Tübingen" "Wittenberg" "Martin Luther" "Johann Reuchlin"
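A possible variation, shown only as a sketch: with `perl = TRUE`, a lookbehind and a lookahead can match just the text between the brackets, so the extra `gsub` step is not needed. The sentence below is a shortened ASCII-only stand-in for `sentenceR`, and `s` and `inner` are names chosen for this illustration.

```r
# Shortened stand-in for `sentenceR`.
s <- "He accepted a call to the University of [[Wittenberg]] by [[Martin Luther]]"

# (?<=\[\[) requires "[[" just before the match, (?=\]\]) requires "]]" just after;
# .*? is lazy so each match stops at the first closing pair.
inner <- regmatches(s, gregexpr("(?<=\\[\\[).*?(?=\\]\\])", s, perl = TRUE))[[1]]
print(inner)
```

Both versions rely on the lazy `.*?`; a greedy `.*` would run from the first `[[` to the last `]]`.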
Many thanks, but I see that the 1st and 4th sentences each contain two people's names. If I add a name such as "Johann Eck" or "Johann Reuchlin" to "toMatch" and run the code above, I still get four sentences as output. My new question is: how can I get the sentences for each person (overlapping)? – hui
I don't quite understand. Are you asking for a) only the sentences that contain all of the people's names, or b) a separate return for each individual name (the sentences that contain Martin Luther, then all the sentences that contain Paul, etc.)? –
@hui let me know if the new code answers your question –