So all the coefficients are the same except for the CRUZADAS ones? How are you fitting the model in R (glm)? Can you try setting both penalties (lambda and alpha) to zero:

  .setRegParam(0)
  .setElasticNetParam(0)
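
For reference, a minimal sketch of how these two setters would slot into the `lr` stage from the original pipeline. The label and feature column names are copied from that code; everything else is left at its default. In Spark ML both parameters already default to 0.0, so this only makes the absence of a penalty explicit.

```scala
import org.apache.spark.ml.classification.LogisticRegression

// Same estimator as in the original pipeline, with both penalties set to zero explicitly.
val lr = new LogisticRegression()
  .setLabelCol("TARGET")
  .setFeaturesCol("Union")
  .setRegParam(0.0)        // lambda: overall regularization strength
  .setElasticNetParam(0.0) // alpha: mix between L2 (0.0) and L1 (1.0)
```

If the zero coefficients persist with these settings, regularization can be ruled out as their cause.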

Cheers,
S


On 24.10.17 at 13:19, Alexis Peña wrote:

Thanks for your answer. The "Cruzadas" features are binary (0/1), so the chi-squared statistic should work on 2x2 tables.
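
To make the 2x2 case concrete, here is a small sketch of the Pearson chi-squared statistic for one binary feature against a binary label. The function name `chiSq2x2` and the cell layout are illustrative assumptions, not part of Spark's API, and returning 0.0 for a degenerate table is a choice made for this sketch only.

```scala
// Pearson chi-squared statistic for a 2x2 contingency table.
// a, b = counts of label 0 / label 1 where the feature is 0
// c, d = counts of label 0 / label 1 where the feature is 1
def chiSq2x2(a: Long, b: Long, c: Long, d: Long): Double = {
  val n = (a + b + c + d).toDouble
  // Product of the four marginal totals (row sums and column sums).
  val denom = (a + b).toDouble * (c + d) * (a + c) * (b + d)
  if (denom == 0.0) 0.0 // assumption: a constant feature or label scores 0
  else n * math.pow(a.toDouble * d - b.toDouble * c, 2) / denom
}
```

A feature that is constant in the training sample (all 0s or all 1s) has an empty row in this table, so the statistic carries no information about the label.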

I fitted the model in SAS and in R, and in both all the coefficients get estimates (not significant). In Spark, only two of these features get a nonzero estimate:

CRUZADAS    4907    0,247624087
CRUZADAS    5304    -0,161424508

Thanks

*From: *Weichen Xu <weichen...@databricks.com>
*Date: *Tuesday, 24 October 2017, 07:23
*To: *Alexis Peña <alexis.p...@exalitica.com>
*CC: *"user @spark" <user@spark.apache.org>
*Subject: *Re: Zero Coefficient in logistic regression

Yes, the chi-squared statistic is only used for categorical features. It does not look appropriate here.

Thanks!

On Tue, Oct 24, 2017 at 5:13 PM, Simon Dirmeier <simon.dirme...@web.de> wrote:

    Hey,

    as far as I know, feature selection using a chi-squared
    statistic can only be done on categorical features, not on
    possibly continuous ones.
    Furthermore, since your logistic model doesn't use any
    regularization, you should be fine on that side. So I'd check the
    ChiSqSelector and possibly replace it with another feature
    selection method.

    There is, however, always the chance that your response does not
    depend on your covariates, in which case you'd estimate a
    coefficient of (close to) zero.

    Cheers,
    Simon

    On 24.10.17 at 04:56, Alexis Peña wrote:

        Hi Guys,

        We are fitting a Logistic model using the following code.

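        import org.apache.spark.ml.Pipeline
        import org.apache.spark.ml.classification.LogisticRegression
        import org.apache.spark.ml.feature.{ChiSqSelector, VectorAssembler}

        // Keep the 10 features of VECTOR_1 most associated with TARGET.
        val Chisqselector = new ChiSqSelector()
          .setNumTopFeatures(10)
          .setFeaturesCol("VECTOR_1")
          .setLabelCol("TARGET")
          .setOutputCol("selectedFeatures")

        // Combine the selected features with the remaining columns.
        val assembler = new VectorAssembler()
          .setInputCols(Array("FEATURES", "selectedFeatures",
            "PROM_MESES_DIST", "RECENCIA", "TEMP_MIN", "TEMP_MAX",
            "PRECIPITACIONES"))
          .setOutputCol("Union")

        val lr = new LogisticRegression()
          .setLabelCol("TARGET")
          .setFeaturesCol("Union")

        val pipeline = new Pipeline()
          .setStages(Array(Chisqselector, assembler, lr))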

        Do you know why the coefficients for the following features
        are estimated as zero? Is this produced by the ChiSqSelector
        or by the logistic model?

        Thanks in advance!!

        CODIGO      PARAMETRO        COEFICIENTES_MUESTREO_BALANCEADO
        PROPIAS     CV_UM            0,276866756
        PROPIAS     CV_U3M           -0,241851427
        PROPIAS     CV_U6M           -0,568312819
        PROPIAS     CV_U12M          0,134706601
        PROPIAS     M_UM             5,47E-06
        PROPIAS     M_U3M            -7,10E-06
        PROPIAS     M_U6M            1,73E-05
        PROPIAS     M_U12M           -5,41E-06
        PROPIAS     CP_UM            -0,050750105
        PROPIAS     CP_U3M           0,125483162
        PROPIAS     CP_U6M           -0,353906788
        PROPIAS     CP_U12M          0,159538155
        PROPIAS     TUM              -0,020217902
        PROPIAS     TU3M             0,002101906
        PROPIAS     TU6M             -0,005481915
        PROPIAS     TU12M            0,003443081
        CRUZADAS    2303             0
        CRUZADAS    3901             0
        CRUZADAS    3905             0
        CRUZADAS    3907             0
        CRUZADAS    3909             0
        CRUZADAS    4102             0
        CRUZADAS    4307             0
        CRUZADAS    4501             0
        CRUZADAS    4907             0,247624087
        CRUZADAS    5304             -0,161424508
        LP          PROM_MESES_DIST  -0,680356554
        PROPIAS     RECENCIA         -0,00289069
        EXTERNAS    TEMP_MIN         0,006488683
        EXTERNAS    TEMP_MAX         -0,013497441
        EXTERNAS    PRECIPITACIONES  -0,007607086
        INTERCEPTO                   2,401593191

