
Feature Engineering

After that, I watched Shanth's kernel on creating additional features from the `bureau.csv` table, and I started to Google many things such as "How to win a Kaggle competition". The results said that the key to winning was feature engineering. So, I decided to feature engineer, but since I didn't really know Python I couldn't do it on Olivier's fork, so I went back to kxx's code. I feature engineered some stuff based on Shanth's kernel (I hand-typed out all the categories) and then fed it into xgboost. It got local CV of 0.772, public LB of 0.768 and private LB of 0.773. So, my feature engineering didn't help. Darn! At this point I wasn't so confident in xgboost, so I tried to rewrite the code to use `glmnet` via the `caret` library, but I didn't know how to fix an error I got while using the `tidyverse`, so I stopped. You can view my code by clicking here.
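The post links to the actual code; as a rough illustration only, here is a minimal pandas/xgboost sketch of the kind of hand-typed categorical count that Shanth's kernel popularized. The `CREDIT_ACTIVE` aggregation and the model settings are my own assumptions, not the author's code.

```python
import pandas as pd
import xgboost as xgb

# Load the historical credit table; SK_ID_CURR links it to the application table.
bureau = pd.read_csv("bureau.csv")
train = pd.read_csv("application_train.csv")

# One hand-typed categorical count in the spirit of Shanth's kernel: how many
# previous credits of each CREDIT_ACTIVE status does each applicant have?
counts = (
    bureau.groupby(["SK_ID_CURR", "CREDIT_ACTIVE"])
          .size()
          .unstack(fill_value=0)
          .add_prefix("BUREAU_CREDIT_ACTIVE_")
          .reset_index()
)
train = train.merge(counts, on="SK_ID_CURR", how="left")

# Feed the numeric features into xgboost (it tolerates the remaining NaNs).
X = train.select_dtypes("number").drop(columns=["TARGET", "SK_ID_CURR"])
model = xgb.XGBClassifier(n_estimators=200, max_depth=6, eval_metric="auc")
model.fit(X, train["TARGET"])
```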

On May 27-31 I returned to Olivier's kernel, but I realized that I didn't just need to take the mean of the historical tables: I could compute the mean, sum, and standard deviation. It was hard for me since I didn't know Python very well, but finally on the 29th I rewrote the code to include these aggregations. This got local CV of 0.783, public LB 0.780 and private LB 0.780. You can see my code by clicking here.
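In pandas terms, that change amounts to something like the sketch below. The column handling is my assumption; `bureau.csv` is one of the historical tables and `AMT_CREDIT_SUM` one of its real numeric columns.

```python
import pandas as pd

bureau = pd.read_csv("bureau.csv")

# Aggregate every numeric column of a historical table three ways
# instead of taking only the mean.
numeric = bureau.select_dtypes("number").drop(columns=["SK_ID_BUREAU"])
aggs = numeric.groupby("SK_ID_CURR").agg(["mean", "sum", "std"])

# Flatten the MultiIndex columns into names like AMT_CREDIT_SUM_mean.
aggs.columns = ["_".join(col) for col in aggs.columns]
```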

The discovery

I was at the library working on the competition on the 31st. I did some feature engineering to create new features. In case you didn't know, feature engineering matters when building models because it lets your model find patterns more easily than if you just used the raw features. The key ones I made were `DAYS_BIRTH / DAYS_EMPLOYED`, `APPLICATION_OCCURS_ON_WEEKEND`, `DAYS_REGISTRATION / DAYS_ID_PUBLISH`, and others. To explain through an example: if your `DAYS_BIRTH` is big but your `DAYS_EMPLOYED` is very small, it means you are old but haven't worked at your current job for a long period of time (maybe because you got fired from your last job), which can suggest future trouble in paying back the loan. The ratio `DAYS_BIRTH / DAYS_EMPLOYED` can express the risk of the applicant better than the raw features. Making a lot of features like this ended up helping out a bunch. You can see the full dataset I created by clicking here.
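A minimal sketch of building those features, assuming the real Home Credit column names; how `APPLICATION_OCCURS_ON_WEEKEND` was actually derived is my guess, not the author's stated method.

```python
import pandas as pd

app = pd.read_csv("application_train.csv")

# DAYS_* columns count days relative to the application date.
# Old applicant + short employment -> large ratio, a pattern the raw
# columns only express jointly.
app["DAYS_BIRTH / DAYS_EMPLOYED"] = app["DAYS_BIRTH"] / app["DAYS_EMPLOYED"]
app["DAYS_REGISTRATION / DAYS_ID_PUBLISH"] = (
    app["DAYS_REGISTRATION"] / app["DAYS_ID_PUBLISH"]
)

# WEEKDAY_APPR_PROCESS_START is a real column; deriving the weekend flag
# from it is my assumption about how the feature was built.
app["APPLICATION_OCCURS_ON_WEEKEND"] = (
    app["WEEKDAY_APPR_PROCESS_START"].isin(["SATURDAY", "SUNDAY"]).astype(int)
)
```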

With the hand-crafted features included, my local CV rose to 0.787, my public LB to 0.790, and my private LB to 0.785. If I recall correctly, at this point I was ranked 14th on the leaderboard and I was freaking out! (It was a big jump from my 0.780 to 0.790.) You can see my code by clicking here.

The next day, I was able to get public LB 0.791 and private LB 0.787 by adding booleans named `is_nan` for some of the columns in `application_train.csv`. For example, if the rating of your house is NULL, then maybe this indicates that you have a different type of house that can't be measured. You can see the dataset by clicking here.
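The pattern is simple to sketch; the columns below are an illustrative subset I picked, not the exact list used in the post.

```python
import pandas as pd

app = pd.read_csv("application_train.csv")

# Missingness itself can carry signal, e.g. housing-score columns that are
# NULL for dwellings that simply cannot be rated.
for col in ["APARTMENTS_AVG", "EXT_SOURCE_1", "OWN_CAR_AGE"]:
    app[col + "_is_nan"] = app[col].isna().astype(int)
```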

One day I tried tinkering more with different values of `max_depth`, `num_leaves` and `min_data_in_leaf` in the LightGBM hyperparameters, but I didn't get any improvements. In the PM though, I submitted the same code with only the random seed changed, and I got public LB 0.792 and the same private LB.
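For context, those three knobs all control tree shape in LightGBM. A minimal sketch with hypothetical values and stand-in data follows; the post names only which parameters were tuned, not their values.

```python
import numpy as np
import lightgbm as lgb

# Stand-in data; the real runs used the engineered Home Credit features.
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 10))
y = rng.integers(0, 2, size=1000)

# The three tree-shape knobs named in the post, with hypothetical values.
params = {
    "objective": "binary",
    "metric": "auc",
    "max_depth": 8,
    "num_leaves": 40,
    "min_data_in_leaf": 60,
    "seed": 42,  # resubmitting with only this changed shifted the public LB
}
model = lgb.train(params, lgb.Dataset(X, label=y), num_boost_round=200)
```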

Stagnation

I tried upsampling, going back to xgboost in R, deleting `EXT_SOURCE_*`, removing columns with low variance, using catboost, and using a lot of Scirpus's Genetic Programming features (in fact, Scirpus's kernel became the kernel I run LightGBM in today), but I wasn't able to improve on the leaderboard. I was also looking into using the geometric mean and the hyperbolic mean as blends, but I didn't find good results there either.
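For reference, a geometric-mean blend of prediction vectors looks like the sketch below. This is the generic technique, not the author's blending code, and the input vectors are made up.

```python
import numpy as np

def geometric_mean_blend(preds: np.ndarray) -> np.ndarray:
    """Blend predictions column-wise: rows are models, columns are samples."""
    clipped = np.clip(preds, 1e-15, 1.0)  # avoid log(0)
    return np.exp(np.log(clipped).mean(axis=0))

# Two hypothetical submission vectors being blended.
p1 = np.array([0.10, 0.80, 0.55])
p2 = np.array([0.20, 0.70, 0.40])
print(geometric_mean_blend(np.vstack([p1, p2])))
```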

