| <!DOCTYPE html> |
| <html xmlns="http://www.w3.org/1999/xhtml" lang="en"> |
| <head> |
| <meta charset="UTF-8"/> |
| <meta http-equiv="X-UA-Compatible" content="IE=edge"/> |
| <meta name="viewport" content="width=device-width, initial-scale=1.0"/> |
| <meta name="generator" content="Asciidoctor 2.0.23"/> |
| <title>Concerning Git’s Packing Heuristics</title> |
| <link rel="stylesheet" href="https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700"/> |
| <style> |
| /*! Asciidoctor default stylesheet | MIT License | https://asciidoctor.org */ |
| /* Uncomment the following line when using as a custom stylesheet */ |
| /* @import "https://fonts.googleapis.com/css?family=Open+Sans:300,300italic,400,400italic,600,600italic%7CNoto+Serif:400,400italic,700,700italic%7CDroid+Sans+Mono:400,700"; */ |
| html{font-family:sans-serif;-webkit-text-size-adjust:100%} |
| a{background:none} |
| a:focus{outline:thin dotted} |
| a:active,a:hover{outline:0} |
| h1{font-size:2em;margin:.67em 0} |
| b,strong{font-weight:bold} |
| abbr{font-size:.9em} |
| abbr[title]{cursor:help;border-bottom:1px dotted #dddddf;text-decoration:none} |
| dfn{font-style:italic} |
| hr{height:0} |
| mark{background:#ff0;color:#000} |
| code,kbd,pre,samp{font-family:monospace;font-size:1em} |
| pre{white-space:pre-wrap} |
| q{quotes:"\201C" "\201D" "\2018" "\2019"} |
| small{font-size:80%} |
| sub,sup{font-size:75%;line-height:0;position:relative;vertical-align:baseline} |
| sup{top:-.5em} |
| sub{bottom:-.25em} |
| img{border:0} |
| svg:not(:root){overflow:hidden} |
| figure{margin:0} |
| audio,video{display:inline-block} |
| audio:not([controls]){display:none;height:0} |
| fieldset{border:1px solid silver;margin:0 2px;padding:.35em .625em .75em} |
| legend{border:0;padding:0} |
| button,input,select,textarea{font-family:inherit;font-size:100%;margin:0} |
| button,input{line-height:normal} |
| button,select{text-transform:none} |
| button,html input[type=button],input[type=reset],input[type=submit]{-webkit-appearance:button;cursor:pointer} |
| button[disabled],html input[disabled]{cursor:default} |
| input[type=checkbox],input[type=radio]{padding:0} |
| button::-moz-focus-inner,input::-moz-focus-inner{border:0;padding:0} |
| textarea{overflow:auto;vertical-align:top} |
| table{border-collapse:collapse;border-spacing:0} |
| *,::before,::after{box-sizing:border-box} |
| html,body{font-size:100%} |
| body{background:#fff;color:rgba(0,0,0,.8);padding:0;margin:0;font-family:"Noto Serif","DejaVu Serif",serif;line-height:1;position:relative;cursor:auto;-moz-tab-size:4;-o-tab-size:4;tab-size:4;word-wrap:anywhere;-moz-osx-font-smoothing:grayscale;-webkit-font-smoothing:antialiased} |
| a:hover{cursor:pointer} |
| img,object,embed{max-width:100%;height:auto} |
| object,embed{height:100%} |
| img{-ms-interpolation-mode:bicubic} |
| .left{float:left!important} |
| .right{float:right!important} |
| .text-left{text-align:left!important} |
| .text-right{text-align:right!important} |
| .text-center{text-align:center!important} |
| .text-justify{text-align:justify!important} |
| .hide{display:none} |
| img,object,svg{display:inline-block;vertical-align:middle} |
| textarea{height:auto;min-height:50px} |
| select{width:100%} |
| .subheader,.admonitionblock td.content>.title,.audioblock>.title,.exampleblock>.title,.imageblock>.title,.listingblock>.title,.literalblock>.title,.stemblock>.title,.openblock>.title,.paragraph>.title,.quoteblock>.title,table.tableblock>.title,.verseblock>.title,.videoblock>.title,.dlist>.title,.olist>.title,.ulist>.title,.qlist>.title,.hdlist>.title{line-height:1.45;color:#7a2518;font-weight:400;margin-top:0;margin-bottom:.25em} |
| div,dl,dt,dd,ul,ol,li,h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6,pre,form,p,blockquote,th,td{margin:0;padding:0} |
| a{color:#2156a5;text-decoration:underline;line-height:inherit} |
| a:hover,a:focus{color:#1d4b8f} |
| a img{border:0} |
| p{line-height:1.6;margin-bottom:1.25em;text-rendering:optimizeLegibility} |
| p aside{font-size:.875em;line-height:1.35;font-style:italic} |
| h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6{font-family:"Open Sans","DejaVu Sans",sans-serif;font-weight:300;font-style:normal;color:#ba3925;text-rendering:optimizeLegibility;margin-top:1em;margin-bottom:.5em;line-height:1.0125em} |
| h1 small,h2 small,h3 small,#toctitle small,.sidebarblock>.content>.title small,h4 small,h5 small,h6 small{font-size:60%;color:#e99b8f;line-height:0} |
| h1{font-size:2.125em} |
| h2{font-size:1.6875em} |
| h3,#toctitle,.sidebarblock>.content>.title{font-size:1.375em} |
| h4,h5{font-size:1.125em} |
| h6{font-size:1em} |
| hr{border:solid #dddddf;border-width:1px 0 0;clear:both;margin:1.25em 0 1.1875em} |
| em,i{font-style:italic;line-height:inherit} |
| strong,b{font-weight:bold;line-height:inherit} |
| small{font-size:60%;line-height:inherit} |
| code{font-family:"Droid Sans Mono","DejaVu Sans Mono",monospace;font-weight:400;color:rgba(0,0,0,.9)} |
| ul,ol,dl{line-height:1.6;margin-bottom:1.25em;list-style-position:outside;font-family:inherit} |
| ul,ol{margin-left:1.5em} |
| ul li ul,ul li ol{margin-left:1.25em;margin-bottom:0} |
| ul.circle{list-style-type:circle} |
| ul.disc{list-style-type:disc} |
| ul.square{list-style-type:square} |
| ul.circle ul:not([class]),ul.disc ul:not([class]),ul.square ul:not([class]){list-style:inherit} |
| ol li ul,ol li ol{margin-left:1.25em;margin-bottom:0} |
| dl dt{margin-bottom:.3125em;font-weight:bold} |
| dl dd{margin-bottom:1.25em} |
| blockquote{margin:0 0 1.25em;padding:.5625em 1.25em 0 1.1875em;border-left:1px solid #ddd} |
| blockquote,blockquote p{line-height:1.6;color:rgba(0,0,0,.85)} |
| @media screen and (min-width:768px){h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6{line-height:1.2} |
| h1{font-size:2.75em} |
| h2{font-size:2.3125em} |
| h3,#toctitle,.sidebarblock>.content>.title{font-size:1.6875em} |
| h4{font-size:1.4375em}} |
| table{background:#fff;margin-bottom:1.25em;border:1px solid #dedede;word-wrap:normal} |
| table thead,table tfoot{background:#f7f8f7} |
| table thead tr th,table thead tr td,table tfoot tr th,table tfoot tr td{padding:.5em .625em .625em;font-size:inherit;color:rgba(0,0,0,.8);text-align:left} |
| table tr th,table tr td{padding:.5625em .625em;font-size:inherit;color:rgba(0,0,0,.8)} |
| table tr.even,table tr.alt{background:#f8f8f7} |
| table thead tr th,table tfoot tr th,table tbody tr td,table tr td,table tfoot tr td{line-height:1.6} |
| h1,h2,h3,#toctitle,.sidebarblock>.content>.title,h4,h5,h6{line-height:1.2;word-spacing:-.05em} |
| h1 strong,h2 strong,h3 strong,#toctitle strong,.sidebarblock>.content>.title strong,h4 strong,h5 strong,h6 strong{font-weight:400} |
| .center{margin-left:auto;margin-right:auto} |
| .stretch{width:100%} |
| .clearfix::before,.clearfix::after,.float-group::before,.float-group::after{content:" ";display:table} |
| .clearfix::after,.float-group::after{clear:both} |
| :not(pre).nobreak{word-wrap:normal} |
| :not(pre).nowrap{white-space:nowrap} |
| :not(pre).pre-wrap{white-space:pre-wrap} |
| :not(pre):not([class^=L])>code{font-size:.9375em;font-style:normal!important;letter-spacing:0;padding:.1em .5ex;word-spacing:-.15em;background:#f7f7f8;border-radius:4px;line-height:1.45;text-rendering:optimizeSpeed} |
| pre{color:rgba(0,0,0,.9);font-family:"Droid Sans Mono","DejaVu Sans Mono",monospace;line-height:1.45;text-rendering:optimizeSpeed} |
| pre code,pre pre{color:inherit;font-size:inherit;line-height:inherit} |
| pre>code{display:block} |
| pre.nowrap,pre.nowrap pre{white-space:pre;word-wrap:normal} |
| em em{font-style:normal} |
| strong strong{font-weight:400} |
| .keyseq{color:rgba(51,51,51,.8)} |
| kbd{font-family:"Droid Sans Mono","DejaVu Sans Mono",monospace;display:inline-block;color:rgba(0,0,0,.8);font-size:.65em;line-height:1.45;background:#f7f7f7;border:1px solid #ccc;border-radius:3px;box-shadow:0 1px 0 rgba(0,0,0,.2),inset 0 0 0 .1em #fff;margin:0 .15em;padding:.2em .5em;vertical-align:middle;position:relative;top:-.1em;white-space:nowrap} |
| .keyseq kbd:first-child{margin-left:0} |
| .keyseq kbd:last-child{margin-right:0} |
| .menuseq,.menuref{color:#000} |
| .menuseq b:not(.caret),.menuref{font-weight:inherit} |
| .menuseq{word-spacing:-.02em} |
| .menuseq b.caret{font-size:1.25em;line-height:.8} |
| .menuseq i.caret{font-weight:bold;text-align:center;width:.45em} |
| b.button::before,b.button::after{position:relative;top:-1px;font-weight:400} |
| b.button::before{content:"[";padding:0 3px 0 2px} |
| b.button::after{content:"]";padding:0 2px 0 3px} |
| p a>code:hover{color:rgba(0,0,0,.9)} |
| #header,#content,#footnotes,#footer{width:100%;margin:0 auto;max-width:62.5em;*zoom:1;position:relative;padding-left:.9375em;padding-right:.9375em} |
| #header::before,#header::after,#content::before,#content::after,#footnotes::before,#footnotes::after,#footer::before,#footer::after{content:" ";display:table} |
| #header::after,#content::after,#footnotes::after,#footer::after{clear:both} |
| #content{margin-top:1.25em} |
| #content::before{content:none} |
| #header>h1:first-child{color:rgba(0,0,0,.85);margin-top:2.25rem;margin-bottom:0} |
| #header>h1:first-child+#toc{margin-top:8px;border-top:1px solid #dddddf} |
| #header>h1:only-child{border-bottom:1px solid #dddddf;padding-bottom:8px} |
| #header .details{border-bottom:1px solid #dddddf;line-height:1.45;padding-top:.25em;padding-bottom:.25em;padding-left:.25em;color:rgba(0,0,0,.6);display:flex;flex-flow:row wrap} |
| #header .details span:first-child{margin-left:-.125em} |
| #header .details span.email a{color:rgba(0,0,0,.85)} |
| #header .details br{display:none} |
| #header .details br+span::before{content:"\00a0\2013\00a0"} |
| #header .details br+span.author::before{content:"\00a0\22c5\00a0";color:rgba(0,0,0,.85)} |
| #header .details br+span#revremark::before{content:"\00a0|\00a0"} |
| #header #revnumber{text-transform:capitalize} |
| #header #revnumber::after{content:"\00a0"} |
| #content>h1:first-child:not([class]){color:rgba(0,0,0,.85);border-bottom:1px solid #dddddf;padding-bottom:8px;margin-top:0;padding-top:1rem;margin-bottom:1.25rem} |
| #toc{border-bottom:1px solid #e7e7e9;padding-bottom:.5em} |
| #toc>ul{margin-left:.125em} |
| #toc ul.sectlevel0>li>a{font-style:italic} |
| #toc ul.sectlevel0 ul.sectlevel1{margin:.5em 0} |
| #toc ul{font-family:"Open Sans","DejaVu Sans",sans-serif;list-style-type:none} |
| #toc li{line-height:1.3334;margin-top:.3334em} |
| #toc a{text-decoration:none} |
| #toc a:active{text-decoration:underline} |
| #toctitle{color:#7a2518;font-size:1.2em} |
| @media screen and (min-width:768px){#toctitle{font-size:1.375em} |
| body.toc2{padding-left:15em;padding-right:0} |
| body.toc2 #header>h1:nth-last-child(2){border-bottom:1px solid #dddddf;padding-bottom:8px} |
| #toc.toc2{margin-top:0!important;background:#f8f8f7;position:fixed;width:15em;left:0;top:0;border-right:1px solid #e7e7e9;border-top-width:0!important;border-bottom-width:0!important;z-index:1000;padding:1.25em 1em;height:100%;overflow:auto} |
| #toc.toc2 #toctitle{margin-top:0;margin-bottom:.8rem;font-size:1.2em} |
| #toc.toc2>ul{font-size:.9em;margin-bottom:0} |
| #toc.toc2 ul ul{margin-left:0;padding-left:1em} |
| #toc.toc2 ul.sectlevel0 ul.sectlevel1{padding-left:0;margin-top:.5em;margin-bottom:.5em} |
| body.toc2.toc-right{padding-left:0;padding-right:15em} |
| body.toc2.toc-right #toc.toc2{border-right-width:0;border-left:1px solid #e7e7e9;left:auto;right:0}} |
| @media screen and (min-width:1280px){body.toc2{padding-left:20em;padding-right:0} |
| #toc.toc2{width:20em} |
| #toc.toc2 #toctitle{font-size:1.375em} |
| #toc.toc2>ul{font-size:.95em} |
| #toc.toc2 ul ul{padding-left:1.25em} |
| body.toc2.toc-right{padding-left:0;padding-right:20em}} |
| #content #toc{border:1px solid #e0e0dc;margin-bottom:1.25em;padding:1.25em;background:#f8f8f7;border-radius:4px} |
| #content #toc>:first-child{margin-top:0} |
| #content #toc>:last-child{margin-bottom:0} |
| #footer{max-width:none;background:rgba(0,0,0,.8);padding:1.25em} |
| #footer-text{color:hsla(0,0%,100%,.8);line-height:1.44} |
| #content{margin-bottom:.625em} |
| .sect1{padding-bottom:.625em} |
| @media screen and (min-width:768px){#content{margin-bottom:1.25em} |
| .sect1{padding-bottom:1.25em}} |
| .sect1:last-child{padding-bottom:0} |
| .sect1+.sect1{border-top:1px solid #e7e7e9} |
| #content h1>a.anchor,h2>a.anchor,h3>a.anchor,#toctitle>a.anchor,.sidebarblock>.content>.title>a.anchor,h4>a.anchor,h5>a.anchor,h6>a.anchor{position:absolute;z-index:1001;width:1.5ex;margin-left:-1.5ex;display:block;text-decoration:none!important;visibility:hidden;text-align:center;font-weight:400} |
| #content h1>a.anchor::before,h2>a.anchor::before,h3>a.anchor::before,#toctitle>a.anchor::before,.sidebarblock>.content>.title>a.anchor::before,h4>a.anchor::before,h5>a.anchor::before,h6>a.anchor::before{content:"\00A7";font-size:.85em;display:block;padding-top:.1em} |
| #content h1:hover>a.anchor,#content h1>a.anchor:hover,h2:hover>a.anchor,h2>a.anchor:hover,h3:hover>a.anchor,#toctitle:hover>a.anchor,.sidebarblock>.content>.title:hover>a.anchor,h3>a.anchor:hover,#toctitle>a.anchor:hover,.sidebarblock>.content>.title>a.anchor:hover,h4:hover>a.anchor,h4>a.anchor:hover,h5:hover>a.anchor,h5>a.anchor:hover,h6:hover>a.anchor,h6>a.anchor:hover{visibility:visible} |
| #content h1>a.link,h2>a.link,h3>a.link,#toctitle>a.link,.sidebarblock>.content>.title>a.link,h4>a.link,h5>a.link,h6>a.link{color:#ba3925;text-decoration:none} |
| #content h1>a.link:hover,h2>a.link:hover,h3>a.link:hover,#toctitle>a.link:hover,.sidebarblock>.content>.title>a.link:hover,h4>a.link:hover,h5>a.link:hover,h6>a.link:hover{color:#a53221} |
| details,.audioblock,.imageblock,.literalblock,.listingblock,.stemblock,.videoblock{margin-bottom:1.25em} |
| details{margin-left:1.25rem} |
| details>summary{cursor:pointer;display:block;position:relative;line-height:1.6;margin-bottom:.625rem;outline:none;-webkit-tap-highlight-color:transparent} |
| details>summary::-webkit-details-marker{display:none} |
| details>summary::before{content:"";border:solid transparent;border-left:solid;border-width:.3em 0 .3em .5em;position:absolute;top:.5em;left:-1.25rem;transform:translateX(15%)} |
| details[open]>summary::before{border:solid transparent;border-top:solid;border-width:.5em .3em 0;transform:translateY(15%)} |
| details>summary::after{content:"";width:1.25rem;height:1em;position:absolute;top:.3em;left:-1.25rem} |
| .admonitionblock td.content>.title,.audioblock>.title,.exampleblock>.title,.imageblock>.title,.listingblock>.title,.literalblock>.title,.stemblock>.title,.openblock>.title,.paragraph>.title,.quoteblock>.title,table.tableblock>.title,.verseblock>.title,.videoblock>.title,.dlist>.title,.olist>.title,.ulist>.title,.qlist>.title,.hdlist>.title{text-rendering:optimizeLegibility;text-align:left;font-family:"Noto Serif","DejaVu Serif",serif;font-size:1rem;font-style:italic} |
| table.tableblock.fit-content>caption.title{white-space:nowrap;width:0} |
| .paragraph.lead>p,#preamble>.sectionbody>[class=paragraph]:first-of-type p{font-size:1.21875em;line-height:1.6;color:rgba(0,0,0,.85)} |
| .admonitionblock>table{border-collapse:separate;border:0;background:none;width:100%} |
| .admonitionblock>table td.icon{text-align:center;width:80px} |
| .admonitionblock>table td.icon img{max-width:none} |
| .admonitionblock>table td.icon .title{font-weight:bold;font-family:"Open Sans","DejaVu Sans",sans-serif;text-transform:uppercase} |
| .admonitionblock>table td.content{padding-left:1.125em;padding-right:1.25em;border-left:1px solid #dddddf;color:rgba(0,0,0,.6);word-wrap:anywhere} |
| .admonitionblock>table td.content>:last-child>:last-child{margin-bottom:0} |
| .exampleblock>.content{border:1px solid #e6e6e6;margin-bottom:1.25em;padding:1.25em;background:#fff;border-radius:4px} |
| .sidebarblock{border:1px solid #dbdbd6;margin-bottom:1.25em;padding:1.25em;background:#f3f3f2;border-radius:4px} |
| .sidebarblock>.content>.title{color:#7a2518;margin-top:0;text-align:center} |
| .exampleblock>.content>:first-child,.sidebarblock>.content>:first-child{margin-top:0} |
| .exampleblock>.content>:last-child,.exampleblock>.content>:last-child>:last-child,.exampleblock>.content .olist>ol>li:last-child>:last-child,.exampleblock>.content .ulist>ul>li:last-child>:last-child,.exampleblock>.content .qlist>ol>li:last-child>:last-child,.sidebarblock>.content>:last-child,.sidebarblock>.content>:last-child>:last-child,.sidebarblock>.content .olist>ol>li:last-child>:last-child,.sidebarblock>.content .ulist>ul>li:last-child>:last-child,.sidebarblock>.content .qlist>ol>li:last-child>:last-child{margin-bottom:0} |
| .literalblock pre,.listingblock>.content>pre{border-radius:4px;overflow-x:auto;padding:1em;font-size:.8125em} |
| @media screen and (min-width:768px){.literalblock pre,.listingblock>.content>pre{font-size:.90625em}} |
| @media screen and (min-width:1280px){.literalblock pre,.listingblock>.content>pre{font-size:1em}} |
| .literalblock pre,.listingblock>.content>pre:not(.highlight),.listingblock>.content>pre[class=highlight],.listingblock>.content>pre[class^="highlight "]{background:#f7f7f8} |
| .literalblock.output pre{color:#f7f7f8;background:rgba(0,0,0,.9)} |
| .listingblock>.content{position:relative} |
| .listingblock code[data-lang]::before{display:none;content:attr(data-lang);position:absolute;font-size:.75em;top:.425rem;right:.5rem;line-height:1;text-transform:uppercase;color:inherit;opacity:.5} |
| .listingblock:hover code[data-lang]::before{display:block} |
| .listingblock.terminal pre .command::before{content:attr(data-prompt);padding-right:.5em;color:inherit;opacity:.5} |
| .listingblock.terminal pre .command:not([data-prompt])::before{content:"$"} |
| .listingblock pre.highlightjs{padding:0} |
| .listingblock pre.highlightjs>code{padding:1em;border-radius:4px} |
| .listingblock pre.prettyprint{border-width:0} |
| .prettyprint{background:#f7f7f8} |
| pre.prettyprint .linenums{line-height:1.45;margin-left:2em} |
| pre.prettyprint li{background:none;list-style-type:inherit;padding-left:0} |
| pre.prettyprint li code[data-lang]::before{opacity:1} |
| pre.prettyprint li:not(:first-child) code[data-lang]::before{display:none} |
| table.linenotable{border-collapse:separate;border:0;margin-bottom:0;background:none} |
| table.linenotable td[class]{color:inherit;vertical-align:top;padding:0;line-height:inherit;white-space:normal} |
| table.linenotable td.code{padding-left:.75em} |
| table.linenotable td.linenos,pre.pygments .linenos{border-right:1px solid;opacity:.35;padding-right:.5em;-webkit-user-select:none;-moz-user-select:none;-ms-user-select:none;user-select:none} |
| pre.pygments span.linenos{display:inline-block;margin-right:.75em} |
| .quoteblock{margin:0 1em 1.25em 1.5em;display:table} |
| .quoteblock:not(.excerpt)>.title{margin-left:-1.5em;margin-bottom:.75em} |
| .quoteblock blockquote,.quoteblock p{color:rgba(0,0,0,.85);font-size:1.15rem;line-height:1.75;word-spacing:.1em;letter-spacing:0;font-style:italic;text-align:justify} |
| .quoteblock blockquote{margin:0;padding:0;border:0} |
| .quoteblock blockquote::before{content:"\201c";float:left;font-size:2.75em;font-weight:bold;line-height:.6em;margin-left:-.6em;color:#7a2518;text-shadow:0 1px 2px rgba(0,0,0,.1)} |
| .quoteblock blockquote>.paragraph:last-child p{margin-bottom:0} |
| .quoteblock .attribution{margin-top:.75em;margin-right:.5ex;text-align:right} |
| .verseblock{margin:0 1em 1.25em} |
| .verseblock pre{font-family:"Open Sans","DejaVu Sans",sans-serif;font-size:1.15rem;color:rgba(0,0,0,.85);font-weight:300;text-rendering:optimizeLegibility} |
| .verseblock pre strong{font-weight:400} |
| .verseblock .attribution{margin-top:1.25rem;margin-left:.5ex} |
| .quoteblock .attribution,.verseblock .attribution{font-size:.9375em;line-height:1.45;font-style:italic} |
| .quoteblock .attribution br,.verseblock .attribution br{display:none} |
| .quoteblock .attribution cite,.verseblock .attribution cite{display:block;letter-spacing:-.025em;color:rgba(0,0,0,.6)} |
| .quoteblock.abstract blockquote::before,.quoteblock.excerpt blockquote::before,.quoteblock .quoteblock blockquote::before{display:none} |
| .quoteblock.abstract blockquote,.quoteblock.abstract p,.quoteblock.excerpt blockquote,.quoteblock.excerpt p,.quoteblock .quoteblock blockquote,.quoteblock .quoteblock p{line-height:1.6;word-spacing:0} |
| .quoteblock.abstract{margin:0 1em 1.25em;display:block} |
| .quoteblock.abstract>.title{margin:0 0 .375em;font-size:1.15em;text-align:center} |
| .quoteblock.excerpt>blockquote,.quoteblock .quoteblock{padding:0 0 .25em 1em;border-left:.25em solid #dddddf} |
| .quoteblock.excerpt,.quoteblock .quoteblock{margin-left:0} |
| .quoteblock.excerpt blockquote,.quoteblock.excerpt p,.quoteblock .quoteblock blockquote,.quoteblock .quoteblock p{color:inherit;font-size:1.0625rem} |
| .quoteblock.excerpt .attribution,.quoteblock .quoteblock .attribution{color:inherit;font-size:.85rem;text-align:left;margin-right:0} |
| p.tableblock:last-child{margin-bottom:0} |
| td.tableblock>.content{margin-bottom:1.25em;word-wrap:anywhere} |
| td.tableblock>.content>:last-child{margin-bottom:-1.25em} |
| table.tableblock,th.tableblock,td.tableblock{border:0 solid #dedede} |
| table.grid-all>*>tr>*{border-width:1px} |
| table.grid-cols>*>tr>*{border-width:0 1px} |
| table.grid-rows>*>tr>*{border-width:1px 0} |
| table.frame-all{border-width:1px} |
| table.frame-ends{border-width:1px 0} |
| table.frame-sides{border-width:0 1px} |
| table.frame-none>colgroup+*>:first-child>*,table.frame-sides>colgroup+*>:first-child>*{border-top-width:0} |
| table.frame-none>:last-child>:last-child>*,table.frame-sides>:last-child>:last-child>*{border-bottom-width:0} |
| table.frame-none>*>tr>:first-child,table.frame-ends>*>tr>:first-child{border-left-width:0} |
| table.frame-none>*>tr>:last-child,table.frame-ends>*>tr>:last-child{border-right-width:0} |
| table.stripes-all>*>tr,table.stripes-odd>*>tr:nth-of-type(odd),table.stripes-even>*>tr:nth-of-type(even),table.stripes-hover>*>tr:hover{background:#f8f8f7} |
| th.halign-left,td.halign-left{text-align:left} |
| th.halign-right,td.halign-right{text-align:right} |
| th.halign-center,td.halign-center{text-align:center} |
| th.valign-top,td.valign-top{vertical-align:top} |
| th.valign-bottom,td.valign-bottom{vertical-align:bottom} |
| th.valign-middle,td.valign-middle{vertical-align:middle} |
| table thead th,table tfoot th{font-weight:bold} |
| tbody tr th{background:#f7f8f7} |
| tbody tr th,tbody tr th p,tfoot tr th,tfoot tr th p{color:rgba(0,0,0,.8);font-weight:bold} |
| p.tableblock>code:only-child{background:none;padding:0} |
| p.tableblock{font-size:1em} |
| ol{margin-left:1.75em} |
| ul li ol{margin-left:1.5em} |
| dl dd{margin-left:1.125em} |
| dl dd:last-child,dl dd:last-child>:last-child{margin-bottom:0} |
| li p,ul dd,ol dd,.olist .olist,.ulist .ulist,.ulist .olist,.olist .ulist{margin-bottom:.625em} |
| ul.checklist,ul.none,ol.none,ul.no-bullet,ol.no-bullet,ol.unnumbered,ul.unstyled,ol.unstyled{list-style-type:none} |
| ul.no-bullet,ol.no-bullet,ol.unnumbered{margin-left:.625em} |
| ul.unstyled,ol.unstyled{margin-left:0} |
| li>p:empty:only-child::before{content:"";display:inline-block} |
| ul.checklist>li>p:first-child{margin-left:-1em} |
| ul.checklist>li>p:first-child>.fa-square-o:first-child,ul.checklist>li>p:first-child>.fa-check-square-o:first-child{width:1.25em;font-size:.8em;position:relative;bottom:.125em} |
| ul.checklist>li>p:first-child>input[type=checkbox]:first-child{margin-right:.25em} |
| ul.inline{display:flex;flex-flow:row wrap;list-style:none;margin:0 0 .625em -1.25em} |
| ul.inline>li{margin-left:1.25em} |
| .unstyled dl dt{font-weight:400;font-style:normal} |
| ol.arabic{list-style-type:decimal} |
| ol.decimal{list-style-type:decimal-leading-zero} |
| ol.loweralpha{list-style-type:lower-alpha} |
| ol.upperalpha{list-style-type:upper-alpha} |
| ol.lowerroman{list-style-type:lower-roman} |
| ol.upperroman{list-style-type:upper-roman} |
| ol.lowergreek{list-style-type:lower-greek} |
| .hdlist>table,.colist>table{border:0;background:none} |
| .hdlist>table>tbody>tr,.colist>table>tbody>tr{background:none} |
| td.hdlist1,td.hdlist2{vertical-align:top;padding:0 .625em} |
| td.hdlist1{font-weight:bold;padding-bottom:1.25em} |
| td.hdlist2{word-wrap:anywhere} |
| .literalblock+.colist,.listingblock+.colist{margin-top:-.5em} |
| .colist td:not([class]):first-child{padding:.4em .75em 0;line-height:1;vertical-align:top} |
| .colist td:not([class]):first-child img{max-width:none} |
| .colist td:not([class]):last-child{padding:.25em 0} |
| .thumb,.th{line-height:0;display:inline-block;border:4px solid #fff;box-shadow:0 0 0 1px #ddd} |
| .imageblock.left{margin:.25em .625em 1.25em 0} |
| .imageblock.right{margin:.25em 0 1.25em .625em} |
| .imageblock>.title{margin-bottom:0} |
| .imageblock.thumb,.imageblock.th{border-width:6px} |
| .imageblock.thumb>.title,.imageblock.th>.title{padding:0 .125em} |
| .image.left,.image.right{margin-top:.25em;margin-bottom:.25em;display:inline-block;line-height:0} |
| .image.left{margin-right:.625em} |
| .image.right{margin-left:.625em} |
| a.image{text-decoration:none;display:inline-block} |
| a.image object{pointer-events:none} |
| sup.footnote,sup.footnoteref{font-size:.875em;position:static;vertical-align:super} |
| sup.footnote a,sup.footnoteref a{text-decoration:none} |
| sup.footnote a:active,sup.footnoteref a:active,#footnotes .footnote a:first-of-type:active{text-decoration:underline} |
| #footnotes{padding-top:.75em;padding-bottom:.75em;margin-bottom:.625em} |
| #footnotes hr{width:20%;min-width:6.25em;margin:-.25em 0 .75em;border-width:1px 0 0} |
| #footnotes .footnote{padding:0 .375em 0 .225em;line-height:1.3334;font-size:.875em;margin-left:1.2em;margin-bottom:.2em} |
| #footnotes .footnote a:first-of-type{font-weight:bold;text-decoration:none;margin-left:-1.05em} |
| #footnotes .footnote:last-of-type{margin-bottom:0} |
| #content #footnotes{margin-top:-.625em;margin-bottom:0;padding:.75em 0} |
| div.unbreakable{page-break-inside:avoid} |
| .big{font-size:larger} |
| .small{font-size:smaller} |
| .underline{text-decoration:underline} |
| .overline{text-decoration:overline} |
| .line-through{text-decoration:line-through} |
| .aqua{color:#00bfbf} |
| .aqua-background{background:#00fafa} |
| .black{color:#000} |
| .black-background{background:#000} |
| .blue{color:#0000bf} |
| .blue-background{background:#0000fa} |
| .fuchsia{color:#bf00bf} |
| .fuchsia-background{background:#fa00fa} |
| .gray{color:#606060} |
| .gray-background{background:#7d7d7d} |
| .green{color:#006000} |
| .green-background{background:#007d00} |
| .lime{color:#00bf00} |
| .lime-background{background:#00fa00} |
| .maroon{color:#600000} |
| .maroon-background{background:#7d0000} |
| .navy{color:#000060} |
| .navy-background{background:#00007d} |
| .olive{color:#606000} |
| .olive-background{background:#7d7d00} |
| .purple{color:#600060} |
| .purple-background{background:#7d007d} |
| .red{color:#bf0000} |
| .red-background{background:#fa0000} |
| .silver{color:#909090} |
| .silver-background{background:#bcbcbc} |
| .teal{color:#006060} |
| .teal-background{background:#007d7d} |
| .white{color:#bfbfbf} |
| .white-background{background:#fafafa} |
| .yellow{color:#bfbf00} |
| .yellow-background{background:#fafa00} |
| span.icon>.fa{cursor:default} |
| a span.icon>.fa{cursor:inherit} |
| .admonitionblock td.icon [class^="fa icon-"]{font-size:2.5em;text-shadow:1px 1px 2px rgba(0,0,0,.5);cursor:default} |
| .admonitionblock td.icon .icon-note::before{content:"\f05a";color:#19407c} |
| .admonitionblock td.icon .icon-tip::before{content:"\f0eb";text-shadow:1px 1px 2px rgba(155,155,0,.8);color:#111} |
| .admonitionblock td.icon .icon-warning::before{content:"\f071";color:#bf6900} |
| .admonitionblock td.icon .icon-caution::before{content:"\f06d";color:#bf3400} |
| .admonitionblock td.icon .icon-important::before{content:"\f06a";color:#bf0000} |
| .conum[data-value]{display:inline-block;color:#fff!important;background:rgba(0,0,0,.8);border-radius:50%;text-align:center;font-size:.75em;width:1.67em;height:1.67em;line-height:1.67em;font-family:"Open Sans","DejaVu Sans",sans-serif;font-style:normal;font-weight:bold} |
| .conum[data-value] *{color:#fff!important} |
| .conum[data-value]+b{display:none} |
| .conum[data-value]::after{content:attr(data-value)} |
| pre .conum[data-value]{position:relative;top:-.125em} |
| b.conum *{color:inherit!important} |
| .conum:not([data-value]):empty{display:none} |
| dt,th.tableblock,td.content,div.footnote{text-rendering:optimizeLegibility} |
| h1,h2,p,td.content,span.alt,summary{letter-spacing:-.01em} |
| p strong,td.content strong,div.footnote strong{letter-spacing:-.005em} |
| p,blockquote,dt,td.content,td.hdlist1,span.alt,summary{font-size:1.0625rem} |
| p{margin-bottom:1.25rem} |
| .sidebarblock p,.sidebarblock dt,.sidebarblock td.content,p.tableblock{font-size:1em} |
| .exampleblock>.content{background:#fffef7;border-color:#e0e0dc;box-shadow:0 1px 4px #e0e0dc} |
| .print-only{display:none!important} |
| @page{margin:1.25cm .75cm} |
| @media print{*{box-shadow:none!important;text-shadow:none!important} |
| html{font-size:80%} |
| a{color:inherit!important;text-decoration:underline!important} |
| a.bare,a[href^="#"],a[href^="mailto:"]{text-decoration:none!important} |
| a[href^="http:"]:not(.bare)::after,a[href^="https:"]:not(.bare)::after{content:"(" attr(href) ")";display:inline-block;font-size:.875em;padding-left:.25em} |
| abbr[title]{border-bottom:1px dotted} |
| abbr[title]::after{content:" (" attr(title) ")"} |
| pre,blockquote,tr,img,object,svg{page-break-inside:avoid} |
| thead{display:table-header-group} |
| svg{max-width:100%} |
| p,blockquote,dt,td.content{font-size:1em;orphans:3;widows:3} |
| h2,h3,#toctitle,.sidebarblock>.content>.title{page-break-after:avoid} |
| #header,#content,#footnotes,#footer{max-width:none} |
| #toc,.sidebarblock,.exampleblock>.content{background:none!important} |
| #toc{border-bottom:1px solid #dddddf!important;padding-bottom:0!important} |
| body.book #header{text-align:center} |
| body.book #header>h1:first-child{border:0!important;margin:2.5em 0 1em} |
| body.book #header .details{border:0!important;display:block;padding:0!important} |
| body.book #header .details span:first-child{margin-left:0!important} |
| body.book #header .details br{display:block} |
| body.book #header .details br+span::before{content:none!important} |
| body.book #toc{border:0!important;text-align:left!important;padding:0!important;margin:0!important} |
| body.book #toc,body.book #preamble,body.book h1.sect0,body.book .sect1>h2{page-break-before:always} |
| .listingblock code[data-lang]::before{display:block} |
| #footer{padding:0 .9375em} |
| .hide-on-print{display:none!important} |
| .print-only{display:block!important} |
| .hide-for-print{display:none!important} |
| .show-for-print{display:inherit!important}} |
| @media amzn-kf8,print{#header>h1:first-child{margin-top:1.25rem} |
| .sect1{padding:0!important} |
| .sect1+.sect1{border:0} |
| #footer{background:none} |
| #footer-text{color:rgba(0,0,0,.6);font-size:.9em}} |
| @media amzn-kf8{#header,#content,#footnotes,#footer{padding:0}} |
| </style> |
| </head> |
| <body class="article"> |
| <div id="header"> |
| <h1>Concerning Git’s Packing Heuristics</h1> |
| </div> |
| <div id="content"> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>Oh, here's a really stupid question:</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> Where do I go |
| to learn the details |
| of Git's packing heuristics?</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Be careful what you ask!</p> |
| </div> |
| <div class="paragraph"> |
| <p>Followers of the Git, please open the Git IRC Log and turn to |
| February 10, 2006.</p> |
| </div> |
| <div class="paragraph"> |
| <p>It’s a rare occasion, and we are joined by the King Git Himself, |
| Linus Torvalds (linus). Nathaniel Smith (njs`) has the floor |
| and seeks enlightenment. Others are present, but silent.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Let’s listen in!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <njs`> Oh, here's a really stupid question -- where do I go to |
| learn the details of Git's packing heuristics? google avails |
| me not, reading the source didn't help a lot, and wading |
| through the whole mailing list seems less efficient than any |
| of that.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>It is a bold start! A plea for help combined with a simultaneous |
| three-pronged attack on some of the tried-and-true mainstays in the quest |
| for enlightenment. Brash accusations of Google being useless. Hubris! |
| Maligning the source. Heresy! Disdain for the mailing list archives. |
| Woe.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><pasky> yes, the packing-related delta stuff is somewhat |
| mysterious even for me ;)</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Ah! Modesty after all.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <linus> njs, I don't think the docs exist. That's something where |
| I don't think anybody else than me even really got involved. |
| Most of the rest of Git others have been busy with (especially |
| Junio), but packing nobody touched after I did it.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>It’s cryptic, yet vague. Linus in style for sure. Wise men |
| interpret this as an apology. A few argue it is merely a |
| statement of fact.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> I guess the next step is "read the source again", but I |
| have to build up a certain level of gumption first :-)</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Indeed! On both points.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> The packing heuristic is actually really really simple.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Bait…​</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> But strange.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>And switch. That ought to do it!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Remember: Git really doesn't follow files. So what it does is |
| - generate a list of all objects |
| - sort the list according to magic heuristics |
| - walk the list, using a sliding window, seeing if an object |
| can be diffed against another object in the window |
| - write out the list in recency order</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>The traditional understatement:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> I suspect that what I'm missing is the precise definition of |
| the word "magic"</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>The traditional insight:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><pasky> yes</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>And Babel-like confusion flowed.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> oh, hmm, and I'm not sure what this sliding window means either</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><pasky> iirc, it appeared to me to be just the sha1 of the object |
| when reading the code casually ...</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>... which simply doesn't sound as a very good heuristics, though ;)</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> .....and recency order. okay, I think it's clear I didn't |
| even realize how much I wasn't realizing :-)</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Ah, grasshopper! And thus the enlightenment begins anew.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <linus> The "magic" is actually in theory totally arbitrary. |
| ANY order will give you a working pack, but no, it's not |
| ordered by SHA-1.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>Before talking about the ordering for the sliding delta |
| window, let's talk about the recency order. That's more |
| important in one way.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> Right, but if all you want is a working way to pack things |
| together, you could just use cat and save yourself some |
| trouble...</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Waaait for it…​.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> The recency ordering (which is basically: put objects |
| _physically_ into the pack in the order that they are |
| "reachable" from the head) is important.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> okay</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> It's important because that's the thing that gives packs |
| good locality. It keeps the objects close to the head (whether |
| they are old or new, but they are _reachable_ from the head) |
| at the head of the pack. So packs actually have absolutely |
| _wonderful_ IO patterns.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Read that again, because it is important.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> But recency ordering is totally useless for deciding how |
| to actually generate the deltas, so the delta ordering is |
| something else.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>The delta ordering is (wait for it): |
| - first sort by the "basename" of the object, as defined by |
| the name the object was _first_ reached through when |
| generating the object list |
| - within the same basename, sort by size of the object |
| - but always sort different types separately (commits first).</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>That's not exactly it, but it's very close.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> The "_first_ reached" thing is not too important, just you |
| need some way to break ties since the same objects may be |
| reachable many ways, yes?</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>And as if to clarify:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> The point is that it's all really just any random |
| heuristic, and the ordering is totally unimportant for |
| correctness, but it helps a lot if the heuristic gives |
| "clumping" for things that are likely to delta well against |
| each other.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>It is an important point, so secretly, I did my own research and have |
| included my results below. To be fair, it has changed some over time. |
| And through the magic of Revisionistic History, I draw upon this entry |
| from The Git IRC Logs on my father’s birthday, March 1:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <gitster> The quote from the above linus should be rewritten a |
| bit (wait for it): |
| - first sort by type. Different objects never delta with |
| each other. |
| - then sort by filename/dirname. hash of the basename |
| occupies the top BITS_PER_INT-DIR_BITS bits, and bottom |
| DIR_BITS are for the hash of leading path elements. |
| - then if we are doing "thin" pack, the objects we are _not_ |
| going to pack but we know about are sorted earlier than |
| other objects. |
| - and finally sort by size, larger to smaller.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>In one swell-foop, clarification and obscurification! Nonetheless, |
| authoritative. Cryptic, yet concise. It even elicits notions of |
| quotes from The Source Code. Clearly, more study is needed.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <gitster> That's the sort order. What this means is: |
| - we do not delta different object types. |
| - we prefer to delta the objects with the same full path, but |
| allow files with the same name from different directories. |
| - we always prefer to delta against objects we are not going |
| to send, if there are some. |
| - we prefer to delta against larger objects, so that we have |
| lots of removals.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>The penultimate rule is for "thin" packs. It is used when |
| the other side is known to have such objects.</pre> |
| </div> |
| </div> |
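| <div class="paragraph"> |
| <p>For those who read code more easily than prophecy, the sort order gitster describes can be sketched as a Python sort key. This is a minimal illustration and not Git's source: the <code>Entry</code> record, its field names, the <code>TYPE_ORDER</code> ranking (only "commits first" comes from the log), and the sample hash values are all assumptions.</p> |
| </div> |

```python
from dataclasses import dataclass

# Assumed ranking: the log only says types are kept apart, with commits first.
TYPE_ORDER = {"commit": 0, "tree": 1, "blob": 2, "tag": 3}


@dataclass
class Entry:
    """Hypothetical record for one object in the to-pack list."""
    label: str
    kind: str
    name_hash: int                # stand-in for the filename/dirname hash
    size: int
    preferred_base: bool = False  # known to the other side, not sent ("thin" packs)


def delta_sort_key(entry: Entry):
    return (
        TYPE_ORDER[entry.kind],    # 1. different types never delta with each other
        entry.name_hash,           # 2. cluster by filename/dirname hash
        not entry.preferred_base,  # 3. objects we will not send sort earlier
        -entry.size,               # 4. larger before smaller
    )


entries = [
    Entry("Makefile@v1", "blob", 7, 100),
    Entry("Makefile@v2", "blob", 7, 150),
    Entry("Makefile@theirs", "blob", 7, 120, preferred_base=True),
    Entry("README@v1", "blob", 3, 50),
    Entry("c1", "commit", 0, 200),
]
ordered = [e.label for e in sorted(entries, key=delta_sort_key)]
```

| <div class="paragraph"> |
| <p>With the made-up values above, the commit sorts first, the lone README next, and the three Makefile versions end up side by side with the one the other end already has in front.</p> |
| </div> |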
| <div class="paragraph"> |
| <p>There it is again. "Thin" packs. I’m thinking to myself, "What |
| is a <em>thin</em> pack?" So I ask:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><jdl> What is a "thin" pack?</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><gitster> Use of --objects-edge to rev-list as the upstream of |
| pack-objects. The pack transfer protocol negotiates that.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Woo hoo! Cleared that <em>right</em> up!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><gitster> There are two directions - push and fetch.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>There! Did you see it? It is not <em>"push" and "pull"</em>! How often the |
| confusion has started here. So casually mentioned, too!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><gitster> For push, git-send-pack invokes git-receive-pack on the |
| other end. The receive-pack says "I have up to these commits". |
| send-pack looks at them, and computes what are missing from |
| the other end. So "thin" could be the default there.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> In the other direction, fetch, git-fetch-pack and |
| git-clone-pack invokes git-upload-pack on the other end |
| (via ssh or by talking to the daemon).</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>There are two cases: fetch-pack with -k and clone-pack is one, |
| fetch-pack without -k is the other. clone-pack and fetch-pack |
| with -k will keep the downloaded packfile without expanded, so |
| we do not use thin pack transfer. Otherwise, the generated |
| pack will have delta without base object in the same pack.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>But fetch-pack without -k will explode the received pack into |
| individual objects, so we automatically ask upload-pack to |
| give us a thin pack if upload-pack supports it.</pre> |
| </div> |
| </div> |
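| <div class="paragraph"> |
| <p>Put differently, a thin pack ships deltas whose base objects are not in the same pack, on the understanding that the receiver already has them. A toy Python model of just that idea follows. The dictionary layout and the function names are hypothetical, and nothing here reflects the actual pack transfer protocol.</p> |
| </div> |

```python
def missing_bases(pack):
    """Return the delta bases that the pack refers to but does not carry.

    `pack` maps each object name to the name of its delta base,
    or to None when the object is stored whole.
    """
    return {base for base in pack.values() if base is not None and base not in pack}


def is_thin(pack):
    """A pack is "thin" when at least one delta lacks its base in the same pack."""
    return bool(missing_bases(pack))


def can_unpack(pack, receiver_has):
    """The receiver can expand the pack only if it already holds every missing base."""
    return missing_bases(pack) <= set(receiver_has)


thin = {"Makefile@v3": "Makefile@v2"}
full = {"Makefile@v2": None, "Makefile@v3": "Makefile@v2"}
```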
| <div class="paragraph"> |
| <p>OK then.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Uh.</p> |
| </div> |
| <div class="paragraph"> |
| <p>Let’s return to the previous conversation still in progress.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> and "basename" means something like "the tail of end of |
| path of file objects and dir objects, as per basename(3), and |
| we just declare all commit and tag objects to have the same |
| basename" or something?</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Luckily, that too is a point that gitster clarified for us!</p> |
| </div> |
| <div class="paragraph"> |
| <p>If I might add, the trick is to make files that <em>might</em> be similar be |
| located close to each other in the hash buckets based on their file |
| names. It used to be that "foo/Makefile", "bar/baz/quux/Makefile" and |
| "Makefile" all landed in the same bucket due to their common basename, |
| "Makefile". However, now they land in "close" buckets.</p> |
| </div> |
| <div class="paragraph"> |
| <p>The algorithm allows not just for the <em>same</em> bucket, but for <em>close</em> |
| buckets to be considered delta candidates. The rationale is |
| essentially that files, like Makefiles, often have very similar |
| content no matter what directory they live in.</p> |
| </div> |
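| <div class="paragraph"> |
| <p>A rough Python sketch of the "basename in the top bits, directory in the bottom bits" idea may help. It is an illustration under assumptions, not the hash Git uses: CRC-32 stands in for whatever hash the real code computes, and the <code>DIR_BITS</code> width of 12 is a guess made for the example.</p> |
| </div> |

```python
import posixpath
import zlib

BITS_PER_INT = 32
DIR_BITS = 12  # assumed width, chosen only for this example


def name_hash(path: str) -> int:
    """Pack a basename hash (top bits) and a directory hash (bottom bits) into one int."""
    directory, base = posixpath.split(path)
    base_bits = BITS_PER_INT - DIR_BITS
    base_hash = zlib.crc32(base.encode()) & ((1 << base_bits) - 1)
    dir_hash = zlib.crc32(directory.encode()) & ((1 << DIR_BITS) - 1)
    return (base_hash << DIR_BITS) | dir_hash


paths = ["foo/Makefile", "README", "bar/baz/quux/Makefile", "Makefile"]
by_hash = sorted(paths, key=name_hash)
```

| <div class="paragraph"> |
| <p>Because the basename owns the high bits, sorting by this value keeps every Makefile next to its cousins, while the directory bits only decide the order among them. Neighbours in the sorted list are exactly what a sliding window gets to compare.</p> |
| </div> |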
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> I played around with different delta algorithms, and with |
| making the "delta window" bigger, but having too big of a |
| sliding window makes it very expensive to generate the pack: |
| you need to compare every object with a _ton_ of other objects.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>There are a number of other trivial heuristics too, which |
| basically boil down to "don't bother even trying to delta this |
| pair" if we can tell before-hand that the delta isn't worth it |
| (due to size differences, where we can take a previous delta |
| result into account to decide that "ok, no point in trying |
| that one, it will be worse").</pre> |
| </div> |
| </div> |
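| <div class="paragraph"> |
| <p>One plausible reading of that "don't bother" check, sketched in Python. The function name and the exact comparison are assumptions made for illustration: if the target is larger than the candidate base by more than the best delta found so far, the new delta will most likely be worse, so the expensive comparison is skipped.</p> |
| </div> |

```python
def worth_trying(base_size: int, target_size: int, best_delta_size: int) -> bool:
    """Cheap pre-check run before the expensive delta computation.

    Bytes the target has beyond what the base holds will most likely have
    to be spelled out literally, so they act as a rough floor on the delta size.
    This is a heuristic, not a proof.
    """
    growth = max(0, target_size - base_size)
    return growth < best_delta_size
```

| <div class="paragraph"> |
| <p>Before any delta has been found, <code>best_delta_size</code> can simply be the size of the whole object, since a delta bigger than that is pointless anyway.</p> |
| </div> |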
| <div class="literalblock"> |
| <div class="content"> |
| <pre>End result: packing is actually very size efficient. It's |
| somewhat CPU-wasteful, but on the other hand, since you're |
| really only supposed to do it maybe once a month (and you can |
| do it during the night), nobody really seems to care.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Nice Engineering Touch, there. Find when it doesn’t matter, and |
| proclaim it a non-issue. Good style too!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> So, just to repeat to see if I'm following, we start by |
| getting a list of the objects we want to pack, we sort it by |
| this heuristic (basically lexicographically on the tuple |
| (type, basename, size)).</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>Then we walk through this list, and calculate a delta of |
| each object against the last n (tunable parameter) objects, |
| and pick the smallest of these deltas.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Vastly simplified, but the essence is there!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Correct.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> And then once we have picked a delta or fulltext to |
| represent each object, we re-sort by recency, and write them |
| out in that order.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Yup. Some other small details:</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>And of course there is the "Other Shoe" Factor too.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> - We limit the delta depth to another magic value (right |
| now both the window and delta depth magic values are just "10")</pre> |
| </div> |
| </div> |
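| <div class="paragraph"> |
| <p>The walk njs` just summarized, together with the depth limit, can be sketched in a few lines of Python. This is a simplified model and not Git's implementation: <code>difflib</code> stands in for the real delta algorithm, <code>COPY_COST</code> is an invented instruction size, and the function names are hypothetical. The defaults of 10 for window and depth are the values Linus quotes.</p> |
| </div> |

```python
from difflib import SequenceMatcher

COPY_COST = 4  # assumed fixed size of one "copy bytes n--m" instruction


def delta_size(base: bytes, target: bytes) -> int:
    """Estimate the size of a delta that rebuilds `target` from `base`."""
    size = 0
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, _i1, _i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            size += COPY_COST        # copy a run out of the base
        elif tag in ("replace", "insert"):
            size += 1 + (j2 - j1)    # literal bytes must be carried in the delta
        # "delete" costs nothing: the bytes are simply never copied
    return size


def pick_deltas(objects, window=10, max_depth=10):
    """Walk (name, data) pairs, already in delta-sort order, with a sliding window.

    Returns {name: base_name}, or None as the value when the object stays whole.
    """
    chosen, depth, recent = {}, {}, []
    for name, data in objects:
        best_base, best_size = None, len(data)  # a delta must beat the full text
        for base_name, base_data in recent:
            if depth[base_name] >= max_depth:
                continue                        # chain would grow too deep
            size = delta_size(base_data, data)
            if size < best_size:
                best_base, best_size = base_name, size
        chosen[name] = best_base
        depth[name] = 0 if best_base is None else depth[best_base] + 1
        recent.append((name, data))
        if len(recent) > window:
            recent.pop(0)                       # slide the window forward
    return chosen


history = [
    ("v3", b"line one\nline two\nline three\n"),
    ("v2", b"line one\nline two\n"),
    ("v1", b"line one\n"),
    ("other", b"\x00\x01\x02\x03"),
]
chosen = pick_deltas(history)
```

| <div class="paragraph"> |
| <p>In this toy history the largest version stays whole, the smaller versions become deltas against it, and the unrelated object is left alone because no delta beats storing it outright.</p> |
| </div> |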
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> Hrm, my intuition is that you'd end up with really _bad_ IO |
| patterns, because the things you want are near by, but to |
| actually reconstruct them you may have to jump all over in |
| random ways.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> - When we write out a delta, and we haven't yet written |
| out the object it is a delta against, we write out the base |
| object first. And no, when we reconstruct them, we actually |
| get nice IO patterns, because: |
| - larger objects tend to be "more recent" (Linus' law: files grow) |
| - we actively try to generate deltas from a larger object to a |
| smaller one |
| - this means that the top-of-tree very seldom has deltas |
| (i.e. deltas in _practice_ are "backwards deltas")</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Again, we should reread that whole paragraph. Not just because |
| Linus has slipped Linus’s Law in there on us, but because it is |
| important. Let’s make sure we clarify some of the points here:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> So the point is just that in practice, delta order and |
| recency order match each other quite well.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <linus> Yes. There's another nice side to this (and yes, it was |
| designed that way ;): |
| - the reason we generate deltas against the larger object is |
| actually a big space saver too!</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> Hmm, but your last comment (if "we haven't yet written out |
| the object it is a delta against, we write out the base object |
| first"), seems like it would make these facts mostly |
| irrelevant because even if in practice you would not have to |
| wander around much, in fact you just brute-force say that in |
| the cases where you might have to wander, don't do that :-)</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Yes and no. Notice the rule: we only write out the base |
| object first if the delta against it was more recent. That |
| means that you can actually have deltas that refer to a base |
| object that is _not_ close to the delta object, but that only |
| happens when the delta is needed to generate an _old_ object.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> See?</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Yeah, no. I missed that on the first two or three readings myself.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> This keeps the front of the pack dense. The front of the |
| pack never contains data that isn't relevant to a "recent" |
| object. The size optimization comes from our use of xdelta |
| (but is true for many other delta algorithms): removing data |
| is cheaper (in size) than adding data.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> When you remove data, you only need to say "copy bytes n--m". |
| In contrast, in a delta that _adds_ data, you have to say "add |
| these bytes: 'actual data goes here'"</pre> |
| </div> |
| </div> |
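| <div class="paragraph"> |
| <p>The asymmetry between removing and adding is easy to see with a toy delta made of "copy" and "add" instructions. The Python below is a sketch for illustration only: the tuple encoding, the <code>copy_cost</code> of 4 bytes, and the use of <code>difflib</code> to find matching runs are assumptions, not Git's delta format.</p> |
| </div> |

```python
from difflib import SequenceMatcher


def make_delta(base: bytes, target: bytes):
    """Describe `target` as copies out of `base` plus literal additions."""
    ops = []
    matcher = SequenceMatcher(None, base, target, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2 - i1))    # "copy bytes n--m"
        elif tag in ("replace", "insert"):
            ops.append(("add", target[j1:j2]))   # actual data goes here
        # "delete": removed bytes need no instruction at all
    return ops


def apply_delta(base: bytes, ops) -> bytes:
    """Rebuild the target from the base and the instruction list."""
    out = bytearray()
    for op in ops:
        if op[0] == "copy":
            _, offset, length = op
            out += base[offset:offset + length]
        else:
            out += op[1]
    return bytes(out)


def delta_cost(ops, copy_cost=4) -> int:
    """Rough encoded size: copies are fixed-size, additions carry their bytes."""
    return sum(copy_cost if op[0] == "copy" else 1 + len(op[1]) for op in ops)


old = b"alpha\nbeta\n"
new = old + b"gamma\ndelta\nepsilon\n"

backward = make_delta(new, old)  # larger, newer base -> smaller, older target
forward = make_delta(old, new)   # smaller base -> larger target
```

| <div class="paragraph"> |
| <p>Going from the larger file to the smaller one needs nothing but a copy instruction, while going the other way has to carry every new byte along. That is the "backwards delta" preference in miniature.</p> |
| </div> |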
| <div class="literalblock"> |
| <div class="content"> |
| <pre>* njs` has quit: Read error: 104 (Connection reset by peer)</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Uhhuh. I hope I didn't blow njs` mind.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>* njs` has joined channel #git</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><pasky> :)</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>The silent observers are amused. Of course.</p> |
| </div> |
| <div class="paragraph"> |
| <p>And as if njs` was expected to be omniscient:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> njs - did you miss anything?</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>OK, I’ll spell it out. That’s Geek Humor. If njs` was not actually |
| connected for a little bit there, how would he know if he missed anything |
| while he was disconnected? He’s a benevolent dictator with a sense of |
| humor! Well noted!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> Stupid router. Or gremlins, or whatever.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>It’s a cheap shot at Cisco. Take 'em when you can.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> Yes and no. Notice the rule: we only write out the base |
| object first if the delta against it was more recent.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> I'm getting lost in all these orders, let me re-read :-) |
| So the write-out order is from most recent to least recent? |
| (Conceivably it could be the opposite way too, I'm not sure if |
| we've said) though my connection back at home is logging, so I |
| can just read what you said there :-)</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>And for those of you paying attention, the Omniscient Trick has just |
| been detailed!</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Yes, we always write out most recent first</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> And, yeah, I got the part about deeper-in-history stuff |
| having worse IO characteristics, one sort of doesn't care.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> With the caveat that if the "most recent" needs an older |
| object to delta against (hey, shrinking sometimes does |
| happen), we write out the old object with the delta.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> (if only it happened more...)</pre> |
| </div> |
| </div> |
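| <div class="paragraph"> |
| <p>The write-out rule (most recent first, except that a delta's base goes out ahead of it when it has not been written yet) can be sketched as follows. The function name and its two arguments are hypothetical, chosen for this illustration, and the sketch ignores everything else that goes into a real pack.</p> |
| </div> |

```python
def write_order(recency, base_of):
    """Return the physical write order for a pack.

    `recency` lists object names, most recent first.
    `base_of` maps a delta object to the name of its base object.
    """
    written, seen = [], set()

    def emit(name):
        if name in seen:
            return
        base = base_of.get(name)
        if base is not None:
            emit(base)  # make sure the base is written before the delta
        seen.add(name)
        written.append(name)

    for name in recency:
        emit(name)
    return written


recency = ["new", "mid", "old"]

# Usual case: files grow, so older objects are deltas against newer ones.
usual = write_order(recency, {"mid": "new", "old": "mid"})

# Shrinking case: the most recent object is a delta against an old one.
shrunk = write_order(recency, {"new": "old"})
```

| <div class="paragraph"> |
| <p>In the usual case the recency order survives untouched. Only in the shrinking case does an old object get pulled to the front, right next to the recent delta that needs it.</p> |
| </div> |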
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <linus> Anyway, the pack-file could easily be denser still, but |
| because it's used both for streaming (the Git protocol) and |
| for on-disk, it has a few pessimizations.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Actually, "pessimization" is a made-up word. But it is a made-up word being |
| used as setup for a later optimization, which is a real word:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> In particular, while the pack-file is then compressed, |
| it's compressed just one object at a time, so the actual |
| compression factor is less than it could be in theory. But it |
| means that it's all nice random-access with a simple index to |
| do "object name->location in packfile" translation.</pre> |
| </div> |
| </div> |
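| <div class="paragraph"> |
| <p>The trade-off Linus describes, compressing each object on its own so that a simple index can find it, looks roughly like this in Python. It is a sketch under assumptions and not the pack or index file format: the in-memory dictionary index is invented for the example, and naming objects by the SHA-1 of their raw bytes is a simplification.</p> |
| </div> |

```python
import hashlib
import zlib


def write_pack(objects):
    """Compress each object separately and remember where it landed."""
    pack = bytearray()
    index = {}
    for data in objects:
        name = hashlib.sha1(data).hexdigest()  # simplified object name
        compressed = zlib.compress(data)
        index[name] = (len(pack), len(compressed))
        pack += compressed
    return bytes(pack), index


def read_object(pack, index, name):
    """Random access: decompress one object without touching its neighbours."""
    offset, length = index[name]
    return zlib.decompress(pack[offset:offset + length])


blobs = [b"hello pack\n" * 20, b"another object\n" * 20, b"tiny"]
pack, index = write_pack(blobs)
wanted = hashlib.sha1(blobs[1]).hexdigest()
```

| <div class="paragraph"> |
| <p>Compressing the whole stream at once would usually squeeze out more, but then fetching one object would mean decompressing everything in front of it.</p> |
| </div> |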
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> I'm assuming the real win for delta-ing large->small is |
| more homogeneous statistics for gzip to run over?</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>(You have to put the bytes in one place or another, but |
| putting them in a larger blob wins on compression)</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>Actually, what is the compression strategy -- each delta |
| individually gzipped, the whole file gzipped, somewhere in |
| between, no compression at all, ....?</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>Right.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Reality IRC sets in. For example:</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><pasky> I'll read the rest in the morning, I really have to go |
| sleep or there's no hope whatsoever for me at the today's |
| exam... g'nite all.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Heh.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> pasky: g'nite</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> pasky: 'luck</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Right: large->small matters exactly because of compression |
| behaviour. If it was non-compressed, it probably wouldn't make |
| any difference.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> yeah</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> Anyway: I'm not even trying to claim that the pack-files |
| are perfect, but they do tend to have a nice balance of |
| density vs ease-of use.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Gasp! OK, saved. That’s a fair Engineering trade-off. Close call! |
| In fact, Linus reflects on some Basic Engineering Fundamentals, |
| design options, etc.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><linus> More importantly, they allow Git to still _conceptually_ |
| never deal with deltas at all, and be a "whole object" store.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> Which has some problems (we discussed bad huge-file |
| behaviour on the Git lists the other day), but it does mean |
| that the basic Git concepts are really really simple and |
| straightforward.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>It's all been quite stable.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>Which I think is very much a result of having very simple |
| basic ideas, so that there's never any confusion about what's |
| going on.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre>Bugs happen, but they are "simple" bugs. And bugs that |
| actually get some object store detail wrong are almost always |
| so obvious that they never go anywhere.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> Yeah.</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>Nuff said.</p> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <linus> Anyway. I'm off for bed. It's not 6AM here, but I've got |
| three kids, and have to get up early in the morning to send |
| them off. I need my beauty sleep.</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre><njs`> :-)</pre> |
| </div> |
| </div> |
| <div class="literalblock"> |
| <div class="content"> |
| <pre> <njs`> appreciate the infodump, I really was failing to find the |
| details on Git packs :-)</pre> |
| </div> |
| </div> |
| <div class="paragraph"> |
| <p>And now you know the rest of the story.</p> |
| </div> |
| </div> |
| <div id="footer"> |
| <div id="footer-text"> |
| Last updated 2025-06-20 18:10:42 -0700 |
| </div> |
| </div> |
| </body> |
| </html> |