I wish to evaluate marginal effects of variables in a logit regression using a dataset like this (with 40k observations):
d1 <- structure(list(dummy.eleito = c(1, 0, 0, 0, 0, 1, 1, 1, 1, 0),
dummy.tratamento = c(1, 1, 0, 0, 0, 0, 0, 0, 0, 0),
Escolaridade = c("SUPERIOR_INCOMPLETO", "FUNDAMENTAL_INCOMPLETO",
"SUPERIOR_COMPLETO", "FUNDAMENTAL_INCOMPLETO",
"SUPERIOR_COMPLETO", "SUPERIOR_COMPLETO", "SUPERIOR_INCOMPLETO",
"SUPERIOR_INCOMPLETO", "SUPERIOR_COMPLETO", "SUPERIOR_INCOMPLETO"),
Raca = c("Preta_Parda", "Preta_Parda", "Preta_Parda", "Preta_Parda",
"Preta_Parda", "Preta_Parda", "BRANCA", "BRANCA", "BRANCA", "BRANCA"),
DESCRICAO_SEXO = c("MASCULINO", "MASCULINO", "MASCULINO",
"MASCULINO", "MASCULINO", "MASCULINO", "MASCULINO",
"MASCULINO", "MASCULINO", "MASCULINO"),
votos.cidade = c(6483, 6483, 6483, 6483, 6483, 6483, 4735,
4735, 4735, 4735),
dummy.prefeito = c(0, 1, 0, 0, 0, 1, 0, 0, 0, 1),
Intensidade.Trat0.Mun = c(0.0152671755725191, 0.0152671755725191, 0.0152671755725191, 0.0152671751,
0.0152671755725191, 0.01526717, 0.02857142856, 0.028571428, 0.028571, 0.0285714),
Var.Receitas = c(3.25607407, 11.424, 4.5549, -0.832116880227985, 5.78901737320675, -0.02459246,
1.151009, -0.3058719238, 0.742947247, -0.2711)),
.Names = c("dummy.eleito", "dummy.tratamento", "Escolaridade", "Raca",
"DESCRICAO_SEXO", "votos.cidade", "dummy.prefeito", "Intensidade.Trat0.Mun",
"Var.Receitas"), row.names = c(NA, 10L), class = "data.frame")
I run the following regression using glm:
model <- glm(dummy.eleito ~ dummy.tratamento + factor(Escolaridade) +
factor(Raca) + factor(DESCRICAO_SEXO) +
votos.cidade + dummy.prefeito +
dummy.tratamento:Intensidade.Trat0.Mun +
Var.Receitas + Var.Receitas:dummy.tratamento,
data = d1,
family = binomial(link = 'logit'))
Then I evaluate marginal effects at some points:
library(margins)
m <- margins(model, at = list(dummy.tratamento = 1,
Intensidade.Trat0.Mun = fivenum(d1$Intensidade.Trat0.Mun),
Var.Receitas = fivenum(d1$Var.Receitas)))
R ran this all night, and by morning it still had not finished. Is that normal? What could be causing it? Is the data too complex, or is the regression formula itself the problem? Even when I ran margins without the at argument, it still did not finish.
Any help?
EDIT:
After updating R to its newest version, this is what I got in the end:
When I ran the regressions I needed and the margins command on the entire dataset, R took a long time, but it did finish.
However, the problem persisted when using the at argument inside margins. I suspect it is because the regression has factor variables. I will probably calculate predicted values of my dependent variable by hand, using the values that I would have passed to at, just to get a sense of the results.
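To make the by-hand idea concrete, here is a minimal sketch of what I have in mind, assuming that calling predict() on a grid of the at values is a reasonable stand-in. The data is simulated and only mimics some of my variable names; sim, fit, and grid are illustrative names, and the reference category for Escolaridade is an arbitrary choice. Note that this gives predicted probabilities at those points, not marginal effects.

```r
# Simulated stand-in data (hypothetical values, NOT the real 40k-row dataset)
set.seed(1)
n <- 500
sim <- data.frame(
  dummy.tratamento      = rbinom(n, 1, 0.3),
  Escolaridade          = sample(c("FUNDAMENTAL_INCOMPLETO",
                                   "SUPERIOR_INCOMPLETO",
                                   "SUPERIOR_COMPLETO"), n, replace = TRUE),
  Intensidade.Trat0.Mun = runif(n, 0.01, 0.03),
  Var.Receitas          = rnorm(n, 2, 3)
)
# Simulated outcome; the coefficients here are made up for illustration
sim$dummy.eleito <- rbinom(
  n, 1, plogis(-0.5 + 0.4 * sim$dummy.tratamento + 0.05 * sim$Var.Receitas)
)

# Reduced version of the logit above (fewer covariates, same interactions)
fit <- glm(dummy.eleito ~ dummy.tratamento + factor(Escolaridade) +
             dummy.tratamento:Intensidade.Trat0.Mun +
             Var.Receitas + Var.Receitas:dummy.tratamento,
           data = sim, family = binomial(link = "logit"))

# All combinations of the values that would have gone into `at`,
# with the factor held at one chosen level
grid <- expand.grid(
  dummy.tratamento      = 1,
  Intensidade.Trat0.Mun = fivenum(sim$Intensidade.Trat0.Mun),
  Var.Receitas          = fivenum(sim$Var.Receitas),
  Escolaridade          = "SUPERIOR_COMPLETO",
  stringsAsFactors      = FALSE
)

# Predicted probability of dummy.eleito = 1 at each grid point
grid$p.hat <- predict(fit, newdata = grid, type = "response")
head(grid)
```

Repeating the same grid with dummy.tratamento = 0 and differencing the two p.hat columns would give a rough treatment contrast at each point, which is close to what I wanted from margins.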
Any suggested alternatives are welcome.