Merging two datasets only by first row in R
I need to merge two datasets, but the second one can contain duplicate ids, for example several rows with id 1. When an id is duplicated, how do I merge using only the first of those rows?
To be more clear, here's a reproducible example:
df1 <- structure(list(id = 1:2, y = 10:11), .Names = c("id", "y"), class = "data.frame", row.names = c(NA,
-2L))
df2 <- structure(list(id = c(1L, 1L, 1L, 2L), x1 = 435:438, x2 = c(435L,
436L, 436L, 438L), x3 = c(435L, 436L, 436L, 438L)), .Names = c("id",
"x1", "x2", "x3"), class = "data.frame", row.names = c(NA, -4L
))
Example: I expect the output in this format:
id y x1 x2 x3
1 10 435 435 435
2 11 438 438 438
That is, rows 2 and 3 of df2 (the repeated id 1) should not take part in the merge.
Asked by: D.Joe, posted: 27 December 2017, answers: 1
You can do it using data.table. Keep only the first row for each id in the second data set (the rows where the helper column idx == 1 in the code below), then merge the two data sets.
Here is the solution:
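library(data.table)
setDT(df2)                          # convert df2 to a data.table by reference
df2[, idx := seq_len(.N), by = id]  # number the rows within each id
df2 <- df2[idx == 1, ]              # keep only the first row of each id
df2[, idx := NULL]                  # drop the helper column
output <- merge(df1, df2, by = "id")
output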
It'll give you your desired output:
id y x1 x2 x3
1 1 10 435 435 435
2 2 11 438 438 438
Answered by: samadhi, posted: 27 December 2017
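For comparison, the same "keep the first row per id, then merge" idea can probably be done in base R without data.table. This is only a sketch and not part of the answer above. It rebuilds the question's data with data.frame() for brevity, and df2_first is a name made up here for illustration.

```r
# Example data, same values as in the question
df1 <- data.frame(id = 1:2, y = 10:11)
df2 <- data.frame(id = c(1L, 1L, 1L, 2L),
                  x1 = 435:438,
                  x2 = c(435L, 436L, 436L, 438L),
                  x3 = c(435L, 436L, 436L, 438L))

# duplicated() flags every repeat of an id after its first appearance,
# so negating it keeps exactly the first row for each id
df2_first <- df2[!duplicated(df2$id), ]  # df2_first: hypothetical helper name

# Merge on id; each id now matches at most one row of df2
output <- merge(df1, df2_first, by = "id")
output
```

Unlike the setDT() approach, this leaves df2 itself unchanged, which may matter if the full second data set is needed again later.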