Your approach is a good start!
my.DT <- myDT[,.(Type=Type, number_of_offices=nrow(my.DT[my.DT$Type=="office",]), number_of_apartments=nrow(my.DT$Type=="apparment",], by="Property")
The brackets don't match, I assume there was some copy-paste error. These can efficiently avoided by following the guidelines on how to create a great reproducible example & reprex package. The code in your question could then look something like this:
library(data.table)
#> Warning: package 'data.table' was built under R version 3.6.2
rooms_dt <- fread(text =
"Property Type
1 apartment
1 office
2 office
2 office
3 apartment
3 apartment
3 office
")
rooms_dt[, .(
Type=Type,
number_of_offices=nrow(rooms_dt[rooms_dt$Type=="office",]),
number_of_apartments=nrow(rooms_dt[rooms_dt$Type=="apartment",])
), by="Property"]
#> Property Type number_of_offices number_of_apartments
#> 1: 1 apartment 4 3
#> 2: 1 office 4 3
#> 3: 2 office 4 3
#> 4: 2 office 4 3
#> 5: 3 apartment 4 3
#> 6: 3 apartment 4 3
#> 7: 3 office 4 3
Created on 2020-03-12 by the reprex package (v0.3.0)
As you noticed, the problem with that approach is that the j expression that gets evaluated for every group (= every Property) the numbers of all apartment/office rows are counted not just the one of the current Property. This could be avoided by subsetting not only for the correct type but also the correct property (available in j as the special variable .BY). This already yields the desired result:
rooms_dt[, .(
Type=Type,
number_of_offices=nrow(rooms_dt[rooms_dt$Property == .BY & rooms_dt$Type=="office",]),
number_of_apartments=nrow(rooms_dt[rooms_dt$Property == .BY & rooms_dt$Type=="apartment",])
), by="Property"]
#> Property Type number_of_offices number_of_apartments
#> 1: 1 apartment 1 1
#> 2: 1 office 1 1
#> 3: 2 office 2 0
#> 4: 2 office 2 0
#> 5: 3 apartment 1 2
#> 6: 3 apartment 1 2
#> 7: 3 office 1 2
However, this solution can be simplified quite a bit:
First and foremost, the columns of the data.table can be accessed directly by their name within []:
rooms_dt[, .(
Type=Type,
number_of_offices=nrow(rooms_dt[Property == .BY & Type=="office",]),
number_of_apartments=nrow(rooms_dt[Property == .BY & Type=="apartment",])
), by="Property"]
Furthermore, data.table provides the current data.table subsetted [sic] to the current group as .SD, which allows to simplify this further:
rooms_dt[, .(
Type=Type,
number_of_offices=nrow(.SD[Type=="office",]),
number_of_apartments=nrow(.SD[Type=="apartment",])
), by="Property"]
Since not only .SD is subsetted, but all columns are, it is even simpler to just work with them directly:
rooms_dt[, .(
Type = Type,
number_of_offices = sum(Type=="office"),
number_of_apartments = sum(Type=="apartment")
), by="Property"]
One of the strengths of data.table is that in contrast to most R stuff is, that they are mutable. This allows to just add the new columns directly to the old data.table without creating a new one:
rooms_dt[, number_of_offices := sum(Type=="office"), by="Property"]
rooms_dt[, number_of_apartments := sum(Type=="apartment"), by="Property"]
This culd also be written as:
rooms_dt[, c("number_of_offices", "number_of_apartments") := list(sum(Type=="office"), sum(Type=="apartment")), by="Property"]
or as :
rooms_dt[, `:=`(number_of_offices = sum(Type=="office"), number_of_apartments = sum(Type=="apartment")), by="Property"]
This was not as easy as it could be, probably because it is rearly needed, often it is desirable to have one of these instead:
rooms_dt[, count_in_property := .N, by=.(Property, Type)]
rooms_dt
#> Property Type number_in_property
#> 1: 1 apartment 1
#> 2: 1 office 1
#> 3: 2 office 2
#> 4: 2 office 2
#> 5: 3 apartment 2
#> 6: 3 apartment 2
#> 7: 3 office 1
or
dcast(rooms_dt[, .(number_of = .N), by=.(Property, Type)], Property ~ Type, fill = 0)
#> Using 'number_of' as value column. Use 'value.var' to override
#> Property apartment office
#> 1: 1 1 1
#> 2: 2 0 2
#> 3: 3 2 1