{"id":3500,"date":"2025-10-05T00:00:00","date_gmt":"2025-10-04T22:00:00","guid":{"rendered":"https:\/\/tecnologia.euroinnova.com\/overfitting\/"},"modified":"2025-10-07T15:00:09","modified_gmt":"2025-10-07T13:00:09","slug":"overfitting","status":"publish","type":"post","link":"https:\/\/tecnologia.euroinnova.com\/en\/overfitting","title":{"rendered":"Overfitting"},"content":{"rendered":"<p class=\"text-align-justify\">The term \u00ab<strong>overfitting<\/strong>\u00bb in machine learning refers to a problem that arises when <strong>a model fits the training data too closely<\/strong>. This reduces its ability to generalise to new data that it has not seen during training.&nbsp;&nbsp;<\/p>\n<p class=\"text-align-justify\">In other words, the model fits the <strong>particularities<\/strong> and the <strong>noise<\/strong> present in the training data set very well, but loses the ability to identify meaningful patterns that can be applied to previously unseen data. 
This concept is also known as \u00ab<strong>over-adjustment<\/strong>\u00bb.&nbsp;<\/p>\n<h2 class=\"text-align-justify\" id=\"consecuencias-del-sobreajuste\">Consequences of overfitting<\/h2>\n<p class=\"text-align-justify\"><strong>Over-fitted models<\/strong> often exhibit high accuracy on the training data set but poor accuracy on new data, such as a validation or test set.&nbsp;&nbsp;<\/p>\n<p class=\"text-align-justify\">Overfitting occurs because the model looks for rules in the training sample that, in reality, do not exist and, instead, <strong>finds structures and patterns in the noise of the training sample<\/strong>.&nbsp;<\/p>\n<p class=\"text-align-justify\">Some <strong>signs<\/strong> that a model may be over-fitted are:&nbsp;<\/p>\n<ul>\n<li>\n<p class=\"text-align-justify\"><strong>A large gap<\/strong> in performance metrics between the training and validation data sets.&nbsp;<\/p>\n<\/li>\n<li>\n<p class=\"text-align-justify\"><strong>Poor generalisation<\/strong> of the model when used on previously unseen data.&nbsp;<\/p>\n<\/li>\n<li>\n<p class=\"text-align-justify\"><strong>Excessive complexity<\/strong> in the structure of the model relative to the signal-to-noise ratio of the data.&nbsp;<\/p>\n<\/li>\n<\/ul>\n<p class=\"text-align-justify\">The consequences of overfitting can be severe for the overall performance of a model, as it loses the ability to effectively predict or classify new, previously unseen data. 
Therefore, detecting and preventing overfitting must be an integral part of the machine learning process.&nbsp;<\/p>\n<h2 class=\"text-align-justify\" id=\"como-prevenir-el-sobreajuste\">How to prevent overfitting?<\/h2>\n<p class=\"text-align-justify\">To <strong>prevent<\/strong> overfitting, various strategies can be employed:&nbsp;<\/p>\n<ul>\n<li>\n<p class=\"text-align-justify\"><strong>Use regularisation techniques<\/strong>: a penalty that grows with the complexity of the model is added to its loss function. This encourages simplicity and reduces the model's ability to over-fit the training data.&nbsp;<\/p>\n<\/li>\n<li>\n<p class=\"text-align-justify\"><strong>Increase the size of the data set<\/strong>: providing the model with more training examples can help minimise overfitting, as it reduces the likelihood that the model memorises the particularities of the training set.&nbsp;<\/p>\n<\/li>\n<li>\n<p class=\"text-align-justify\"><strong>Use <\/strong><a href=\"https:\/\/tecnologia.euroinnova.com\/en\/validacion-cruzada\/\"><strong>cross-validation<\/strong><\/a>: this consists of dividing the training data set into several subsets, training the model on some of these subsets and evaluating it on the remaining ones. 
In this way, a more accurate estimate of the model's performance on unseen data can be obtained.&nbsp;<\/p>\n<\/li>\n<li>\n<p class=\"text-align-justify\"><strong>Reduce the complexity of the model<\/strong>: simplifying the model structure, for example by reducing the number of parameters or, in decision trees, the depth of the tree, can help reduce the risk of overfitting.&nbsp;<\/p>\n<\/li>\n<\/ul>\n<p class=\"text-align-justify\">&nbsp;<\/p>\n<h2 class=\"text-align-justify\" id=\"la-varianza-y-el-overfitting-en-el-sobreajuste\">Variance and bias in overfitting<\/h2>\n<p class=\"text-align-justify\">The concept of overfitting is closely related to the \u00ab<strong>bias-variance trade-off<\/strong>\u00bb in machine learning. <strong>Variance and bias<\/strong> are properties of a model that influence its predictive performance:&nbsp;<\/p>\n<ul>\n<li>\n<p class=\"text-align-justify\"><a href=\"https:\/\/tecnologia.euroinnova.com\/en\/sesgo-estadistica\/\"><strong>Bias<\/strong><\/a> refers to the error introduced by the simplifying assumptions of the model; a simpler model tends to ignore noise in the data. A model with high bias oversimplifies the relationship between input and output data, which can result in poor predictions on both the training and test data sets.&nbsp;<\/p>\n<\/li>\n<li>\n<p class=\"text-align-justify\"><strong>Variance<\/strong> refers to the sensitivity of the model to noise in the training data. A model with high variance captures even the noise in the training data set, leading to overfitting.&nbsp;<\/p>\n<\/li>\n<\/ul>\n<p class=\"text-align-justify\">It is important to find an optimal balance between bias and variance, as both extremes can be detrimental to model performance. 
A model with high variance and low bias over-fits the data, while a model with high bias and low variance under-fits: it does not fit the data well enough.&nbsp;<\/p>","protected":false},"excerpt":{"rendered":"<p>The term \u00aboverfitting\u00bb in machine learning refers to a problem that arises when a model fits too [&hellip;]<\/p>\n","protected":false},"author":2,"featured_media":0,"comment_status":"closed","ping_status":"closed","sticky":false,"template":"","format":"standard","meta":{"_acf_changed":false,"site-sidebar-layout":"default","site-content-layout":"","ast-site-content-layout":"default","site-content-style":"default","site-sidebar-style":"default","ast-global-header-display":"","ast-banner-title-visibility":"","ast-main-header-display":"","ast-hfb-above-header-display":"","ast-hfb-below-header-display":"","ast-hfb-mobile-header-display":"","site-post-title":"","ast-breadcrumbs-content":"","ast-featured-img":"","footer-sml-layout":"","ast-disable-related-posts":"","theme-transparent-header-meta":"","adv-header-id-meta":"","stick-header-meta":"","header-above-stick-meta":"","header-main-stick-meta":"","header-below-stick-meta":"","astra-migrate-meta-layouts":"default","ast-page-background-enabled":"default","ast-page-background-meta":{"desktop":{"background-color":"var(--ast-global-color-5)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"ast-content-background-meta":{"desktop":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"tablet":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""},"mobile":{"background-color":"var(--ast-global-color-4)","background-image":"","background-repeat":"repeat","background-position":"center 
center","background-size":"auto","background-attachment":"scroll","background-type":"","background-media":"","overlay-type":"","overlay-color":"","overlay-opacity":"","overlay-gradient":""}},"footnotes":""},"categories":[25],"tags":[],"class_list":["post-3500","post","type-post","status-publish","format-standard","hentry","category-metaterminos"],"acf":[],"_links":{"self":[{"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/posts\/3500","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/users\/2"}],"replies":[{"embeddable":true,"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/comments?post=3500"}],"version-history":[{"count":0,"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/posts\/3500\/revisions"}],"wp:attachment":[{"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/media?parent=3500"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/categories?post=3500"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/tecnologia.euroinnova.com\/en\/wp-json\/wp\/v2\/tags?post=3500"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}