<?xml version="1.0" encoding="utf-8"?>
<feed xmlns="http://www.w3.org/2005/Atom">
  <title>LinkinStar&#39;s Blog</title>
  
  
  <link href="https://www.linkinstars.com/atom.xml" rel="self"/>
  
  <link href="https://www.linkinstars.com/"/>
  <updated>2026-05-09T10:13:37.571Z</updated>
  <id>https://www.linkinstars.com/</id>
  
  <author>
    <name>LinkinStar</name>
    
  </author>
  
  <generator uri="https://hexo.io/">Hexo</generator>
  
  <entry>
    <title>Claude Code 设计有哪些值得学习的地方</title>
    <link href="https://www.linkinstars.com/post/9c7a4cd1.html"/>
    <id>https://www.linkinstars.com/post/9c7a4cd1.html</id>
    <published>2026-05-08T16:00:00.000Z</published>
    <updated>2026-05-09T10:13:37.571Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/9c7a4cd1.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>本文非 AI 创作</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>现在网络上的文章都没人看，是因为很多内容不是人写的。<br>在网上找参考资料的时候我就发现大量的解析源码的文章全部都是 AI 写的，条条是到，然后观看量几乎没有，人一看到 1、2、3 这样的列举就头疼，现在在这样的时代下，坚持手动创作其实确实挺需要毅力的。<br><strong>本文主要是我在浏览和思考的过程中，总结的一些我自己在没有看之前想不到的地方。</strong></p><h2 id="循环本身"><a href="#循环本身" class="headerlink" title="循环本身"></a>循环本身</h2><p><img src="https://blog.linkinstars.com/blog/agent-loop-diagram.svg" alt="agent-loop-diagram"></p><p>首先循环如果真的简单说非常简单，其本质是工具的调用。为什么之前的模型只会对话？而现在加了 cc 之后就能工作了，答案就是工具。有了工具就有了手，就能干活了。而整个循环就是不断的在调用工具，直到不需要调用工具为止，工作就结束了。这个设计就像是人在写代码一样，人是先思考，然后再开始写，然后测试，发现不行回来再改再写，再测试再写，最终写完。AI 也是，思考要怎么做，使用哪些工具去做，做完之后要怎么检查等等。最终全部做完，不需要用工具了，只剩下思考总结一下就退出了。</p><p>所以 agentic loop 其实说简单就是这样而已。而其中更细节的更难的，其实是工具的调用和执行。</p><h2 id="多-agent-的好处"><a href="#多-agent-的好处" class="headerlink" title="多 agent 的好处"></a>多 agent 的好处</h2><p>如果让你直接想为什么需要多 agent 同时执行，你没有思考直接给到的想法肯定就一个字 “快”。但是除了快，还有别的什么吗？</p><p>我看了解析给我有几个印象深刻的地方：</p><ol><li>每个 agent 的<strong>工具不同，效率就能不一样</strong>。因为我们直到工具和 mcp 越多，越慢，因为 AI 要做出更多的选择，而由于多 agent 协同工作，每个都负责一定的“螺丝钉”的工作，就不需要那么多工具了，比如你只是单纯写代码，就不用联网去搜了，调研的交给别人去完成了。</li><li><strong>上下文</strong>，这个仔细自己想一下还是能想到的，因为上下文现在是有限制的，上下文太大会被压缩，而分给其他 agent 工作之后，细节就不需要管了，我不需要知道这个工作是如何完成的，我只要知道最后已经完成就可以了。最后做一个汇总就好了。</li><li>将通知作为用户消息。多 agent 协作是通过<strong>消息机制</strong>来沟通的，而对于大模型来说，并不用理解这种消息机制，而将其他 agent 派来的任务直接作为用户的一种行为输入就可以了，将机器命令作为人的命令去执行。这样也是完美覆盖了之前的循环的。</li></ol><p>整体实现其实并没有复杂，用的还是那些技术，甚至可以说，这样的协同工作和消息机制我好像在哪里看到过，没错，操作系统里面。</p><h2 id="为什么-Claude-code-看起来更智能"><a href="#为什么-Claude-code-看起来更智能" class="headerlink" title="为什么 Claude code 看起来更智能"></a>为什么 Claude code 看起来更智能</h2><p>至少从我使用角度上来说，即使使用相同的模型的情况下 cc 比其他 agent 会更智能。我也看到了其他对于 cc 和 oc(openclaw) 的讨论，包括论文，其实我觉得他们本质没什么不一样。</p><p>我的总结是：<strong>通过提供工具和调用链将 LLM 和目标结合为 flow 从而实现用户描述的任务。</strong></p><p>而区别就在与权限的控制、工具的提供、上下文的提供的等等其他方面。</p><p>那么为什么看起来完成任务的效果更好呢？我觉得其中很大一部分原因在于，<strong>用更少的上下文和更少的工具完成任务，有时候质量反而会更高</strong>。有的时候我觉得过多的 harness 就是跑非的关键，带上了一堆和任务无关的上下文就会影响它的工作效率和状态，而带上了一堆无用的工具就像是累赘让 AI 选择困难。</p><h2 id="思考"><a href="#思考" class="headerlink" title="思考"></a>思考</h2><p>对我们使用上来说有什么帮助吗？其实几乎没有，我一直相信好的 AI 工具是不需要人过多去介入和使用的，而看完之后给我更多的是对于 agent 设计的思考。</p><p>如果现在让我去设计一个新的 agent，抛开 agnet loop 的想法，获取我会用两个系统，一个是有限状态机将工作分成几个形态 agent 在这几个形态中运转，一个是事件总线所有的消息包括用户全部通过统一调度去分发，但依旧没有 AL 来的那么简单直接和优雅。</p><p>所以其实我们在设计 agent 的时候，<strong>如何让它更简单直接的去完成用户的任务</strong>，而不是让它去思考和选择太多的工具和上下文，才是我们应该去思考的。</p><h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>参考链接</h2><ul><li><a href="https://code.claude.com/docs/en/agent-sdk/agent-loop">https://code.claude.com/docs/en/agent-sdk/agent-loop</a></li><li><a href="https://arxiv.org/pdf/2604.14228">https://arxiv.org/pdf/2604.14228</a></li><li><a href="https://mp.weixin.qq.com/s?__biz=MzUxODAzNDg4NQ==&mid=2247557309&idx=1&sn=db872d9df4336797d2c364b5c4e4e880">https://mp.weixin.qq.com/s?__biz=MzUxODAzNDg4NQ==&amp;mid=2247557309&amp;idx=1&amp;sn=db872d9df4336797d2c364b5c4e4e880</a></li></ul>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;本文非 AI 创作&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="ai" scheme="https://www.linkinstars.com/categories/ai/"/>
    
    
    <category term="ai" scheme="https://www.linkinstars.com/tags/ai/"/>
    
  </entry>
  
  <entry>
    <title>2025 读书总结</title>
    <link href="https://www.linkinstars.com/post/8413bc25.html"/>
    <id>https://www.linkinstars.com/post/8413bc25.html</id>
    <published>2025-12-30T16:00:00.000Z</published>
    <updated>2026-02-04T09:04:22.557Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/8413bc25.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>2025 年度读书总结</p></blockquote><h2 id="读书列表"><a href="#读书列表" class="headerlink" title="读书列表"></a>读书列表</h2><blockquote><p>仅罗列，顺序是随机。怕链接会过期，失效的建议直接搜书名。</p></blockquote><h3 id="非小说"><a href="#非小说" class="headerlink" title="非小说"></a>非小说</h3><ul><li><a href="https://book.douban.com/subject/30333919/">《架构整洁之道》</a></li><li><a href="https://book.douban.com/subject/36071759/">《创造 : 用非传统方式做有价值的事》</a></li><li><a href="https://book.douban.com/subject/37015972/">《深入理解 Linux 进程与内存》</a></li><li><a href="https://book.douban.com/subject/1322025/">《卓有成效的管理者》</a></li><li><a href="https://book.douban.com/subject/35803094/">《System Design Interview, Vol.2》</a></li><li><a href="https://book.douban.com/subject/35609208/">《贝佐斯如何开会》</a></li><li><a href="https://book.douban.com/subject/1013208/">《如何阅读一本书》</a></li><li><a href="https://book.douban.com/subject/26877306/">《微习惯》</a></li><li><a href="https://book.douban.com/subject/3533221/">《非暴力沟通》</a></li><li><a href="https://book.douban.com/subject/36792034/">《从零开始的大语言模型原理与实践教程》</a></li><li><a href="https://book.douban.com/subject/37262793/">《StatQuest 图解机器学习》</a></li><li><a href="https://book.douban.com/subject/30133649/">《第五项修炼》</a></li><li><a href="https://book.douban.com/subject/37142217/">《黄仁勋：英伟达之芯》</a></li><li><a href="https://book.douban.com/subject/36673627/">《时势》</a></li><li><a href="https://book.douban.com/subject/37110620/">《财富方程式》</a></li></ul><h3 id="小说"><a href="#小说" class="headerlink" title="小说"></a>小说</h3><ul><li><a href="https://book.douban.com/subject/37036136/">《白夜追凶》</a></li><li><a href="https://book.douban.com/subject/36981461/">《凶案密码》</a></li><li><a href="https://book.douban.com/subject/30204297/">《夜行实录》</a></li><li><a href="https://book.douban.com/subject/30419261/">《天才在左，疯子在右》</a></li><li><a href="https://book.douban.com/subject/35837574/">《漂白》</a></li><li><a href="https://book.douban.com/subject/26818125/">《破绽》</a></li><li><a href="https://book.douban.com/subject/26392203/">《尸案调查科》</a></li></ul><h2 id="2025-最佳"><a href="#2025-最佳" class="headerlink" title="2025 最佳"></a>2025 最佳</h2><p><mark style="background: #FFB86CA6;">《贝佐斯如何开会》</mark></p><p>今年毫无疑问是这本书，因为这本书是真正付出实践的一本书。其中的给我影响的就是，选择什么要开的会，哪些会议是应该开的，哪些是不应该开的。然后就是 1 张纸的会议，将会议的内容浓缩到一张纸中，开会之前阅读。实际操作下来极大的提升了会议的效率。里面的内容让我去实际中可以运用和体验到改变，这本书就变得非常有意义了。</p><h2 id="新的思考"><a href="#新的思考" class="headerlink" title="新的思考"></a>新的思考</h2><p>今年由于 AI 的工具出现的可太多了，很多时候都在尝试各种工具所带来的改变，其实看的视频和其他内容比书要多不少。也造成了不少新的思考。有人问我，现在看书还有用吗？我的回答是一定的，看书仅仅是你的其中一种输入方式而已，现在有了那么多工具也就有了更多的更好的输入方式，但是“潜移默化”四个字我是一直相信的，很多你觉得很有道理的短视频或者 AI 的解释图片非常精美，但是可能你转头就忘记了，但是我看的很多书带给我的思考依然我还记忆犹新。<br>当然我看的书明年可能也会发生很大的变化，因为发现很多的基础亦或者底层原理的书要被我扔了，比如我现在就有几个 Rust 的语法书已经扔了。简单点说就是 “八股文” 的时代终于要结束了，我以前看的 Spring 源码都可以扔了，因为确实不会用了。而你留下的是什么呢？是其中的设计思想。<br>所以这也对于我后续的博客产生了一些变化，我可能会更少写一些教程或源码解析的类的博客了，会更多的关注一些设计、思路以及对于一些常见问题的解决方案，在现在这个时代下，我觉得设计是将会是后续的好路。如何针对一个场景给出自己的设计，如何快速评判 AI 给出的几个设计方案应用在现有场景的可行性。</p><p>总之，希望在新的一年里面，能找到属于自己的新的节奏和思路，把握这个美好的时代带给我们的风。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;2025 年度读书总结&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;读书列表&quot;&gt;&lt;a href=&quot;#读书列表&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="reading" scheme="https://www.linkinstars.com/categories/reading/"/>
    
    
    <category term="reading" scheme="https://www.linkinstars.com/tags/reading/"/>
    
  </entry>
  
  <entry>
    <title>设计万物 - 软删除与唯一索引的“相爱相杀”</title>
    <link href="https://www.linkinstars.com/post/daf75097.html"/>
    <id>https://www.linkinstars.com/post/daf75097.html</id>
    <published>2025-11-30T16:00:00.000Z</published>
    <updated>2026-01-29T08:11:40.282Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/daf75097.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="引言"><a href="#引言" class="headerlink" title="引言"></a>引言</h2><p>本地的设计来自于群里的一个小伙伴的提问：</p><blockquote><p>A: 请教一下，关于软删除，删除后新增同样的数据报 UNIQUE 索引冲突的问题，大家有没有什么好办法处理？<br>B: 重建唯一索引，原来字段和软删除时间戳或状态建立唯一索引<br>A: 恩，目前就是这样做的，看来只能重建索引了</p></blockquote><p>看到这个报错，“老司机”应该会心一笑，因为有这部分开发经验的同学应该经常遇到这个问题，而我对于上面 B 提到的使用 原字段 和 状态 建立唯一索引的方法更是觉得不妥，这几乎是每一位后端开发者在进阶路上都会踩到的坑——<strong>软删除（Soft Delete）与唯一索引（Unique Index）的冲突</strong>。</p><p>在现代 Web 开发中，为了数据安全、审计追踪以及防止手误，我们很少直接从数据库中 <code>DELETE</code> 物理删除数据，而是习惯使用一个标记位（如 <code>is_deleted</code>）来做“软删除”。然而，当业务需求遇上“用户名全局唯一”、“手机号全局唯一”这类硬性指标时，软删除和数据库层面的唯一约束就会发生剧烈的化学反应。</p><p>今天，我们就来聊聊这个经典话题：<strong>在软删除模式下，如何优雅地设计唯一索引？</strong></p><h2 id="软删除状态标记会出现什么问题？"><a href="#软删除状态标记会出现什么问题？" class="headerlink" title="软删除状态标记会出现什么问题？"></a>软删除状态标记会出现什么问题？</h2><blockquote><p>下面均以 <code>username</code> 作为唯一字段</p></blockquote><p>首先，如果单独给 <code>username</code>加唯一索引肯定不行，因为软删除的情况下，用户名还在，如果删除的用户名再次添加，就会报错。可能会出现后台查不到，用户也用不了的情况。既然单纯索引 <code>username</code> 不行，那把删除状态也加进去呢？</p><ul><li><strong>做法</strong>：建立联合索引 <code>UNIQUE KEY (username, is_deleted)</code>。</li><li><strong>结果</strong>：<ul><li><code>(&#39;Alice&#39;, 0)</code> 和 <code>(&#39;Alice&#39;, 1)</code> 可以共存。看似解决了问题。</li></ul></li><li><strong>致命缺陷</strong>：<ul><li>如果 Alice 注销（变成 1），重新注册（变成 0），再次注销（变成 1）……</li><li>第二次注销时，数据库里将会有两条 <code>(&#39;Alice&#39;, 1)</code>。</li><li><strong>报错！</strong> 唯一索引再次生效。这意味着<strong>一个用户名只能被“软删除”一次</strong>。这显然不符合现实业务需求。</li></ul></li></ul><h2 id="时间戳"><a href="#时间戳" class="headerlink" title="时间戳"></a>时间戳</h2><p>然后就有人会很容易想到使用时间作为删除的标记，那么因为每次删除的时间大概率不一样，所以使用删除时间和用户名作为联合唯一索引的话就能解决这个问题。不过坑点也有，那就是 <code>delete_at</code> 初始值不能设置为 NULL，因为 MySQL 对于带有 NULL 值的联合索引是有特别处理的，在 5.7 里面两个相同用户名均为 NULL 删除时间的情况是允许存在的。这也是为什么想要写这篇博客最大的原因，<strong>因为这个 bug 可能存在但可能一直不会被发现</strong>。而且我还遇到过，因为数据库的限制往往是最后一道防线，是为了避免并发条件下不会出现脏数据而设计的。而前置业务代码中肯定也会做用户名唯一性的校验和判断，所以大多数情况下会测不到，特别是注册还需要验证什么的。只有大量并发的时候才会出现。</p><ul><li><strong>做法</strong>：<ul><li>字段改为 <code>delete_at</code> (建议用 BigInt)。</li><li><strong>未删除</strong>：默认为 <code>0</code>（这也是关键点，尽量避免使用 NULL）。</li><li><strong>已删除</strong>：填入当前的毫秒级时间戳。</li><li><strong>索引</strong>：<code>UNIQUE KEY (username, delete_at)</code>。</li></ul></li><li><strong>演练</strong>：<ol><li>Alice 注册：<code>(&#39;Alice&#39;, 0)</code>。 -&gt; <strong>成功</strong></li><li>Alice 注销：更新为 <code>(&#39;Alice&#39;, 1678888888)</code>。 -&gt; <strong>成功</strong></li><li>Alice 重新注册：插入 <code>(&#39;Alice&#39;, 0)</code>。 -&gt; <strong>成功</strong>（因为 0 !&#x3D; 1678888888）</li><li>Alice 再次注销：更新为 <code>(&#39;Alice&#39;, 1679999999)</code>。 -&gt; <strong>成功</strong>（时间戳不同）</li></ol></li><li><strong>评价</strong>：简单、有效、能保留完整的删除历史。适合绝大多数业务场景。</li></ul><h2 id="归档表"><a href="#归档表" class="headerlink" title="归档表"></a>归档表</h2><p>当数据量极大，或者“历史数据”完全不需要参与日常业务查询时，物理隔离也是一种的选择。我曾经就也遇到过这样的情况，然后后续发现业务调整这部分的删除数据直接转移到另一张归档表中，一方面这部分数据可以被直接减少，极大的可以减轻主表的压力，而且这部分的归档数据往往只有在特殊情况下才会被审计，通常情况下可能永远就用不到了。</p><ul><li><strong>做法</strong>：<ul><li><code>users</code> 表：<strong>只存活跃用户</strong>。直接加 <code>UNIQUE KEY (username)</code>。</li><li><code>users_history</code> 表：存删除的用户，不加唯一约束。</li><li><strong>删除动作</strong>（只是做一个样例，实际可能是在业务代码中实现的）：<figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">START TRANSACTION;</span><br><span class="line">INSERT INTO users_history SELECT * FROM users WHERE id = 1;</span><br><span class="line">DELETE FROM users WHERE id = 1;</span><br><span class="line">COMMIT;</span><br></pre></td></tr></table></figure></li></ul></li><li><strong>评价</strong>：主表极度瘦身，查询飞快。但代价是“删除”操作变重了，且如果要恢复数据（Undo），逻辑会比较麻烦。（当前这种恢复操作可能这辈子出现不了一次）</li></ul><h2 id="修改唯一字段"><a href="#修改唯一字段" class="headerlink" title="修改唯一字段"></a>修改唯一字段</h2><p>其实还有一种业务上也会用到的方法，那就是直接修改唯一字段，让唯一字段永远唯一。比如删除的时候，在软删除的基础之上，直接修改用户名，在用户名后面拼接上一个时间戳，这样也可以保证用户名的唯一，不过这也是建立在以后不需要查询删除和恢复数据的基础上。因为处理简单，所以也会被使用到。只要业务上允许就可以。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>之前我也会觉得一个软删除能有什么？但是实际中其实这种小的坑还是特别多的。而且特别是在国内的环境上，要求每个数据不能被永久删除，万一审计亦或是安全审查核对的时候，还有可能是排查意外问题等等情况下，所以在软删除被更多用到的时候，这样的问题就会容易出现。而正因为有朋友在群里面问了，我就发现这种坑特别容易坑到人。当然解决方案多种多样，我们只要选择自己合适的就好。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;引言&quot;&gt;&lt;a href=&quot;#引言&quot; class=&quot;headerlink&quot; title=&quot;引言&quot;&gt;&lt;/a&gt;引言&lt;/h2&gt;&lt;p&gt;本地的设计来自于群里的一个小伙伴的提问：&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;A: 请教一下，关于软删除，删除后新增同样的数据报</summary>
        
      
    
    
    
    <category term="system-design" scheme="https://www.linkinstars.com/categories/system-design/"/>
    
    
    <category term="system-design" scheme="https://www.linkinstars.com/tags/system-design/"/>
    
  </entry>
  
  <entry>
    <title>watchtower 更新来自 AWS ECR 仓库的镜像</title>
    <link href="https://www.linkinstars.com/post/23aa97f0.html"/>
    <id>https://www.linkinstars.com/post/23aa97f0.html</id>
    <published>2025-11-14T16:00:00.000Z</published>
    <updated>2025-11-14T04:03:31.296Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/23aa97f0.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<p>之前我已经在另一篇博客中介绍过了，在小项目的异步 cd 中表现不错，它可以帮助我们自动拉取最新的镜像并更新容器。</p><p>如果你正在使用 <strong>Watchtower</strong> 来自动更新 Docker 容器，并使用 <strong>Amazon ECR (Elastic Container Registry)</strong> 作为你的私有镜像仓库，你可能会遇到一个棘手的问题：<code>no basic auth credentials. Proceeding to next.</code></p><p>ECR 的身份验证机制是动态的，依赖于临时的 AWS 凭证，而不是静态的用户名和密码，所以需要一些额外的配置来帮助我们解决拉取 ECR 中镜像的问题。通过使用 <code>amazon-ecr-credential-helper</code>（一个 Docker 凭证助手）来解决这个问题。</p><h3 id="解决方案概览"><a href="#解决方案概览" class="headerlink" title="解决方案概览"></a>解决方案概览</h3><p>整体的解决方案其实官方在文档中已经给出，<a href="https://containrrr.dev/watchtower/private-registries/#credential_helpers">https://containrrr.dev/watchtower/private-registries/#credential_helpers</a></p><p>不过由于文档比较旧了，其中有一些部分需要调整一下。整体的策略是构建 <code>amazon-ecr-credential-helper</code> 二进制文件，将其存储在一个 Docker 卷（Volume）中，然后配置 Watchtower 容器来使用这个助手和相应的 Docker 配置文件。</p><h3 id="步骤一：构建凭证助手-Credential-Helper"><a href="#步骤一：构建凭证助手-Credential-Helper" class="headerlink" title="步骤一：构建凭证助手 (Credential Helper)"></a>步骤一：构建凭证助手 (Credential Helper)</h3><p>首先，我们需要编译这个助手。我们不会将它安装在主机上，而是使用一个临时的 Docker 容器来构建它，并将结果输出到一个共享卷中。</p><h4 id="创建-Dockerfile"><a href="#创建-Dockerfile" class="headerlink" title="创建 Dockerfile"></a>创建 Dockerfile</h4><p>创建一个名为 <code>Dockerfile</code> 的文件，内容如下</p><figure class="highlight dockerfile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">FROM</span> golang:<span class="number">1.24</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">ENV</span> CGO_ENABLED=<span class="number">0</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">RUN</span><span class="language-bash"> go install github.com/awslabs/amazon-ecr-credential-helper/ecr-login/cli/docker-credential-ecr-login@latest</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">WORKDIR</span><span class="language-bash"> /go/bin/</span></span><br></pre></td></tr></table></figure><blockquote><p>注意这里的内容就和官方的不一样，官方的镜像版本太老，导致无法正确安装</p></blockquote><h4 id="构建并将二进制文件存入卷"><a href="#构建并将二进制文件存入卷" class="headerlink" title="构建并将二进制文件存入卷"></a>构建并将二进制文件存入卷</h4><p>运行以下命令来创建卷、构建镜像，并运行一个临时容器将编译好的 <code>docker-credential-ecr-login</code> 二进制文件复制到卷中：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 1. 创建一个用于存储二进制文件的卷</span></span><br><span class="line">$ docker volume create helper</span><br><span class="line"></span><br><span class="line"><span class="comment"># 2. 使用上面的 Dockerfile 构建镜像</span></span><br><span class="line">$ docker build -t aws-ecr-dock-cred-helper .</span><br><span class="line"></span><br><span class="line"><span class="comment"># 3. 运行容器，将 /go/bin (包含二进制文件) 挂载到 helper 卷</span></span><br><span class="line"><span class="comment"># 容器启动后即退出，但二进制文件已保存在卷中</span></span><br><span class="line">$ docker run  -d --<span class="built_in">rm</span> --name aws-cred-helper --volume helper:/go/bin aws-ecr-dock-cred-helper</span><br><span class="line"></span><br><span class="line"><span class="comment"># 4. 然后查看是否存在 helper，如果存在就表示已经可以了</span></span><br><span class="line">$ docker volume <span class="built_in">ls</span></span><br></pre></td></tr></table></figure><h3 id="步骤二：配置-Docker-凭证"><a href="#步骤二：配置-Docker-凭证" class="headerlink" title="步骤二：配置 Docker 凭证"></a>步骤二：配置 Docker 凭证</h3><p>在你的工作目录（例如与 <code>docker-compose.yml</code> 同级）创建一个 <code>.docker/config.json</code> 文件。</p><p><strong>注意</strong>：请将 <code>&lt;AWS_ACCOUNT_ID&gt;</code> 和 <code>&lt;AWS_ECR_REGION&gt;</code> 替换为你的实际 AWS 账户 ID 和 ECR 区域。</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;credsStore&quot;</span> <span class="punctuation">:</span> <span class="string">&quot;ecr-login&quot;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;HttpHeaders&quot;</span> <span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">       <span class="attr">&quot;User-Agent&quot;</span> <span class="punctuation">:</span> <span class="string">&quot;Docker-Client/19.03.1 (XXXXXX)&quot;</span></span><br><span class="line">    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;auths&quot;</span> <span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">       <span class="attr">&quot;&lt;AWS_ACCOUNT_ID&gt;.dkr.ecr.&lt;AWS_ECR_REGION&gt;.amazonaws.com&quot;</span> <span class="punctuation">:</span> <span class="punctuation">&#123;</span><span class="punctuation">&#125;</span></span><br><span class="line">    <span class="punctuation">&#125;</span><span class="punctuation">,</span></span><br><span class="line">    <span class="attr">&quot;credHelpers&quot;</span><span class="punctuation">:</span> <span class="punctuation">&#123;</span></span><br><span class="line">       <span class="attr">&quot;&lt;AWS_ACCOUNT_ID&gt;.dkr.ecr.&lt;AWS_ECR_REGION&gt;.amazonaws.com&quot;</span> <span class="punctuation">:</span> <span class="string">&quot;ecr-login&quot;</span></span><br><span class="line">    <span class="punctuation">&#125;</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h3 id="步骤三：配置-Watchtower-Docker-Compose"><a href="#步骤三：配置-Watchtower-Docker-Compose" class="headerlink" title="步骤三：配置 Watchtower (Docker Compose)"></a>步骤三：配置 Watchtower (Docker Compose)</h3><p>现在我们将所有部分组合在一起。创建一个 <code>docker-compose.yml</code> 文件来启动 Watchtower。</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">services:</span></span><br><span class="line">  <span class="attr">myapp:</span></span><br><span class="line">    <span class="attr">image:</span> <span class="number">1651561651.</span><span class="string">dkr.ecr.us-east-1.amazonaws.com/myapp:latest</span></span><br><span class="line">    <span class="attr">container_name:</span> <span class="string">myapp</span></span><br><span class="line">    <span class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line">    <span class="comment"># 通过配置 label 能告诉 watchtower 需要关注哪些镜像的更新</span></span><br><span class="line">    <span class="attr">labels:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;com.centurylinklabs.watchtower.enable=true&quot;</span></span><br><span class="line">  <span class="attr">watchtower:</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">containrrr/watchtower</span></span><br><span class="line">    <span class="attr">restart:</span> <span class="string">unless-stopped</span></span><br><span class="line">    <span class="attr">volumes:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">/var/run/docker.sock:/var/run/docker.sock</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">./config.json:/config.json</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">helper:/go/bin</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">HOME=/</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">PATH=$PATH:/go/bin</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">WATCHTOWER_POLL_INTERVAL=10</span> <span class="comment"># 间隔时间 10s 用于测试，实际可以根据所需调整</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">WATCHTOWER_MONITOR_ONLY=true</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">WATCHTOWER_LABEL_ENABLE=true</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">WATCHTOWER_CLEANUP=true</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">AWS_REGION=us-east-1</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">AWS_ACCESS_KEY_ID=AKxxxxx</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">AWS_SECRET_ACCESS_KEY=xxxxx</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 最重要的是这个，需要将我们之前的 helper 挂载进去</span></span><br><span class="line"><span class="attr">volumes:</span></span><br><span class="line">  <span class="attr">helper:</span></span><br><span class="line">    <span class="attr">external:</span> <span class="literal">true</span></span><br></pre></td></tr></table></figure><p>然后，你就可以进行测试了，启动后可以查看 watchtower 的日志可以看到是否已经能正常的拉取镜像，而没有出现认证的错误了。</p><blockquote><p>需要注意的是，在实际的测试过程中发现有时候第一次部署的时候如果没有按照部署的步骤来操作，如果先部署了 watchtower 然后再重新挂载的镜像，一定要注意先删除实例 (<code>docker compose down</code>)，以及清空 helper <code>docker volume rm -f helper</code> 然后重新执行步骤，从而能避免无法正确挂载或正常执行的问题。</p></blockquote><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><p>通过构建 <code>amazon-ecr-credential-helper</code> 并将其与 Watchtower 容器共享（通过 Docker 卷），同时提供正确的 Docker 配置文件和 AWS 环境变量，我们就设置了一个安全且自动化的流程。现在，Watchtower 可以监控你的 ECR 仓库，并在新镜像发布时自动更新你的服务。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;p&gt;之前我已经在另一篇博客中介绍过了，在小项目的异步 cd 中表现不错，它可以帮助我们自动拉取最新的镜像并更新容器。&lt;/p&gt;
&lt;p&gt;如果你正在使用 &lt;strong&gt;Watchtower&lt;/strong&gt; 来自动更新 Docker 容器，并使用 &lt;strong&gt;Amazon</summary>
        
      
    
    
    
    <category term="watchtower" scheme="https://www.linkinstars.com/categories/watchtower/"/>
    
    
    <category term="watchtower" scheme="https://www.linkinstars.com/tags/watchtower/"/>
    
  </entry>
  
  <entry>
    <title>设计万物 - 拖动列表排序</title>
    <link href="https://www.linkinstars.com/post/cc80f87b.html"/>
    <id>https://www.linkinstars.com/post/cc80f87b.html</id>
    <published>2025-09-30T16:00:00.000Z</published>
    <updated>2025-11-14T04:03:31.297Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/cc80f87b.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="场景"><a href="#场景" class="headerlink" title="场景"></a>场景</h2><p>在现代 Web 应用中，拖放排序功能为用户提供了直观、便捷的交互体验。无论是调整任务列表的优先级，还是重新排列相册中的图片，流畅的拖放操作总能极大地提升用户满意度。然而，一个看似简单的功能，其背后却需要一个高效、稳健的后端设计来支撑。</p><h2 id="思考"><a href="#思考" class="headerlink" title="思考"></a>思考</h2><p>在我最早写代码的时候，那个时候还没有那么流行 H5 前端技术也没有现在迭代那么好，拖动排序也还不流行，那个时候的排序是分为：上移、下移、置顶、置底这些操作。所以在那个时候，并没有什么问题，你只需要一个 <code>order</code> 值记录数据的顺序位置，并在点击按钮的时候执行对应操作进行整体排序亦或是简单的数据修改就可以了。比如：上移其实就是和前一条数据交换 <code>order</code> 值罢了。</p><p>随之技术发展，到了现在，拖动变成了一个非常容易实现的操作，从而问题就出现了。<strong>如何让一个数据放到任意一个位置呢？</strong></p><h2 id="方案"><a href="#方案" class="headerlink" title="方案"></a>方案</h2><h3 id="笨办法"><a href="#笨办法" class="headerlink" title="笨办法"></a>笨办法</h3><p>当年我的第一个方案就是前端传递这个数据的最终位置信息，比如将 A 移动到，2 和 3 数据之间。然后后端收到这个数据重新计算所有数据的 <code>order</code> 并更新。</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span><span class="attr">&quot;id&quot;</span><span class="punctuation">:</span> <span class="number">100001</span><span class="punctuation">,</span> <span class="attr">&quot;pre_order&quot;</span><span class="punctuation">:</span> <span class="number">2</span><span class="punctuation">,</span> <span class="attr">&quot;post_order&quot;</span><span class="punctuation">:</span> <span class="number">3</span><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>这就和一个数组排序一样，将一个数据找到起点或终点，然后定位到所需位置，然后将之后的数据全部重新计算一次 index 索引之后重新得到新的顺序，这个方案可能是最直接能被想到的。可以简单叫 ”更新法“</p><p>到现在回想起来，其实我觉得这种方案在大多数情况下依旧是可用的，为什么？因为绝大多数需要进行拖动排序的场景不会有太大的数据量。仔细思考 🤔，如果你有一千条数据，你是不可能期望用户将一条数据从第一拖动到 500 的，这操作太麻烦了。多数场景下几十条数据之内是会采用这种交互的，而这个方案最大的问题就是可能一次性更新的记录数太多。不过也正由于场景中数据量不会太大，所以没什么问题。</p><p>但如果真的数据量比较多，又有调整顺序的需求呢？即便不是拖动的交互。</p><h3 id="新办法"><a href="#新办法" class="headerlink" title="新办法"></a>新办法</h3><p>为了解决全量更新法的性能问题，我们可以采用一种更为精妙的方法——取中值法。这种方法的核心思想是，为每个列表项分配一个排序值（可以是浮点数或一个很大的整数），当一个项目被移动到新的位置时，<strong>它的排序值将被设置为其前后两个项目排序值的中间值</strong>。</p><h4 id="实现逻辑"><a href="#实现逻辑" class="headerlink" title="实现逻辑"></a>实现逻辑</h4><p>我们同样需要一个排序字段，例如 <code>sort_key</code>，但这次它的类型通常是浮点数（<code>float</code> 或 <code>double</code>）或长整型（<code>bigint</code>）。为了避免一开始就出现小数，可以初始化时使用较大的整数间隔，例如 10000, 20000, 30000…</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span><span class="attr">&quot;id&quot;</span><span class="punctuation">:</span> <span class="number">100001</span><span class="punctuation">,</span> <span class="attr">&quot;pre_id&quot;</span><span class="punctuation">:</span> <span class="number">100002</span><span class="punctuation">,</span> <span class="attr">&quot;post_id&quot;</span><span class="punctuation">:</span> <span class="number">100005</span><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>通过 <code>pre_id</code> 和 <code>post_id</code> 找到 <code>prev_sort_key</code> <code>next_sort_key</code>，然后计算这两个值的中间值：<code>new_sort_key = (prev_sort_key + next_sort_key) / 2</code>。</p><p>让我们直接看表格数据变化会更容易理解，下面展示的是任务 D 移动到 任务 A 和任务 B 之间的情况：</p><table><thead><tr><th><strong>项目</strong></th><th><strong>排序前顺序</strong></th><th><strong>排序前排序键 (Sort Key)</strong></th><th><strong>排序后顺序</strong></th><th><strong>排序后排序键 (Sort Key)</strong></th><th><strong>备注</strong></th></tr></thead><tbody><tr><td>任务 A</td><td>1</td><td>10000</td><td>1</td><td>10000</td><td>未移动</td></tr><tr><td>任务 B</td><td>2</td><td>20000</td><td>3</td><td>20000</td><td>未移动</td></tr><tr><td>任务 C</td><td>3</td><td>30000</td><td>4</td><td>30000</td><td>未移动</td></tr><tr><td>任务 D</td><td>4</td><td>40000</td><td>2</td><td><strong>15000</strong></td><td><strong>被移动项，排序键更新</strong></td></tr><tr><td>任务 E</td><td>5</td><td>50000</td><td>5</td><td>50000</td><td>未移动</td></tr></tbody></table><h4 id="特殊情况"><a href="#特殊情况" class="headerlink" title="特殊情况"></a>特殊情况</h4><p>这个方案也有几个特殊情况需要处理</p><ul><li><strong>移动到首位</strong>：如果项目被移动到列表的最前面，其新的 <code>sort_key</code> 可以是当前最小 <code>sort_key</code> 的一半，或者 <code>min_sort_key - 10000</code>（如果使用整数间隔）。</li><li><strong>移动到末尾</strong>：如果被移动到最后，其新的 <code>sort_key</code> 可以是当前最大 <code>sort_key</code> 再加上一个固定间隔，例如 <code>max_sort_key + 10000</code>。</li><li><strong>精度问题</strong>：如果频繁地在两个元素之间插入新元素，<code>sort_key</code> 的浮点数精度可能会耗尽，或者整数间隔会变得非常小。此时，需要一个“重排（rebalancing）”机制，当检测到排序间隔过小时，异步地重新计算并均匀分布整个列表的 <code>sort_key</code> 值。</li></ul><p>对于精度问题这通常是一个兜底的方案，实际中很少情况会出现经常拖动到同一个位置的情况，并且如果选择间隔为 65536 是能支持拖 16 次…</p><h4 id="优点"><a href="#优点" class="headerlink" title="优点"></a>优点</h4><p>这个方案的优点非常明显，就是拖动完成之后只需要更新一条数据就可以完成想要的排序操作，避免的频繁更新大量数据的问题，同时这个方案的实现代码会更简单，只需要简单的查询和计算即可，并且可以将两种特殊情况放入到基础情况中一并计算，代码会更加简洁。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>当然选择哪种拖动排序的设计方案，最终取决于您的具体业务需求。</p><ul><li>如果您的应用场景中列表规模可控，且用户不会进行过于频繁的排序操作，<strong>更新法</strong>以其简单性和可靠性成为一个不错的选择。</li><li>而对于追求极致性能和流畅用户体验的大型应用，<strong>取中值法</strong>无疑是更优的方案。虽然实现上稍显复杂，但它带来的性能提升和可扩展性是全量更新法无法比拟的。</li></ul><p>这作为第一篇，也就是我最想写的一篇，我的感觉是，如果我从没有听说过这个方案，那么我可能从工作开始到结束会一直用老办法用到底。我也是偶然在知乎上看到别人发帖才去仔细思考这个场景和他们提出的设计，最后用在了实际业务中发现确实好用。一旦你听过这个设计，那么会让系统的某个小部分更优雅，也正是这一个个小场景中的小设计其实是最容易被学会的。这种经验我觉得会比我们常说的系统设计来的更有意义。</p><h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>参考链接</h2><ul><li><a href="https://www.zhihu.com/question/55789722">https://www.zhihu.com/question/55789722</a></li><li><a href="https://gist.github.com/wonderbeyond/0516ed0ec5da78a40fbccf86149c229d">https://gist.github.com/wonderbeyond/0516ed0ec5da78a40fbccf86149c229d</a></li><li><a href="https://cloud.tencent.com/developer/article/2436900">https://cloud.tencent.com/developer/article/2436900</a></li></ul>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;场景&quot;&gt;&lt;a href=&quot;#场景&quot; class=&quot;headerlink&quot; title=&quot;场景&quot;&gt;&lt;/a&gt;场景&lt;/h2&gt;&lt;p&gt;在现代 Web</summary>
        
      
    
    
    
    <category term="system-design" scheme="https://www.linkinstars.com/categories/system-design/"/>
    
    
    <category term="system-design" scheme="https://www.linkinstars.com/tags/system-design/"/>
    
  </entry>
  
  <entry>
    <title>我不叫“工厂”，就不是工厂了吗？</title>
    <link href="https://www.linkinstars.com/post/faa07b5b.html"/>
    <id>https://www.linkinstars.com/post/faa07b5b.html</id>
    <published>2025-07-31T16:00:00.000Z</published>
    <updated>2026-01-29T06:54:35.425Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/faa07b5b.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>工厂模式（Factory Pattern）是设计模式中最常见、最基础的模式之一。往往在初学设计模式时，工厂模式是第一个接触到的模式。但在实际工作中，很多人对工厂模式的理解都停留在“有一个工厂类负责创建对象”这样的表面现象上，导致在使用时往往走入误区。</p><p>在实际工作之后慢慢积累经验，我发现工厂模式的本质其实并不在于“工厂”这个名字，所以我想带你重新思考一下设计模式的本质。</p><h2 id="工厂实现"><a href="#工厂实现" class="headerlink" title="工厂实现"></a>工厂实现</h2><blockquote><p>为了说明工厂模式的本质，我们先来看一个简单的代码例子</p></blockquote><p>比如，我们有一个“支付”场景：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;fmt&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment">// 1. 支付器接口 (产品的抽象)</span></span><br><span class="line"><span class="keyword">type</span> IPayment <span class="keyword">interface</span> &#123;</span><br><span class="line">Pay(amount <span class="type">float64</span>) <span class="type">string</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 2. 具体产品A：支付宝</span></span><br><span class="line"><span class="keyword">type</span> AliPay <span class="keyword">struct</span>&#123;&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(a *AliPay)</span></span> Pay(amount <span class="type">float64</span>) <span class="type">string</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Sprintf(<span class="string">&quot;使用支付宝支付了 %.2f 元&quot;</span>, amount)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 3. 具体产品B：微信支付</span></span><br><span class="line"><span class="keyword">type</span> WechatPay <span class="keyword">struct</span>&#123;&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(w *WechatPay)</span></span> Pay(amount <span class="type">float64</span>) <span class="type">string</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Sprintf(<span class="string">&quot;使用微信支付了 %.2f 元&quot;</span>, amount)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 4. “工厂”函数</span></span><br><span class="line"><span class="comment">// 它根据输入，决定“new”哪个具体的结构体</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewPayment</span><span class="params">(payType <span class="type">string</span>)</span></span> (IPayment, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="keyword">switch</span> payType &#123;</span><br><span class="line"><span class="keyword">case</span> <span class="string">&quot;ali&quot;</span>:</span><br><span class="line"><span class="keyword">return</span> &amp;AliPay&#123;&#125;, <span class="literal">nil</span></span><br><span class="line"><span class="keyword">case</span> <span class="string">&quot;wechat&quot;</span>:</span><br><span class="line"><span class="keyword">return</span> &amp;WechatPay&#123;&#125;, <span class="literal">nil</span></span><br><span class="line"><span class="keyword">default</span>:</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;不支持的支付方式: %s&quot;</span>, payType)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 5. 客户 (调用方)</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line"><span class="comment">// 客户代码完全不认识 AliPay 或 WechatPay 结构体</span></span><br><span class="line"><span class="comment">// 它只认识 &quot;IPayment&quot; 接口和 &quot;NewPayment&quot; 工厂</span></span><br><span class="line">payment, err := NewPayment(<span class="string">&quot;ali&quot;</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">fmt.Println(err)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line">fmt.Println(payment.Pay(<span class="number">100.0</span>))</span><br><span class="line"></span><br><span class="line">payment, err = NewPayment(<span class="string">&quot;wechat&quot;</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">fmt.Println(err)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line">fmt.Println(payment.Pay(<span class="number">200.0</span>))</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="灵魂拷问：为什么要这么做？"><a href="#灵魂拷问：为什么要这么做？" class="headerlink" title="灵魂拷问：为什么要这么做？"></a>灵魂拷问：为什么要这么做？</h3><p>上面的代码，<code>main</code> 函数（客户）里为什么不直接这样写？</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// 客户直接 &quot;new&quot;</span></span><br><span class="line">payment := &amp;AliPay&#123;&#125;</span><br><span class="line">fmt.Println(payment.Pay(<span class="number">100.0</span>))</span><br></pre></td></tr></table></figure><p>这样不是更简单吗？为什么非要搞一个 <code>NewPayment</code> 函数绕一下？</p><p><strong>答案是：为了“隔离”。</strong></p><p><code>main</code> 函数是“<strong>使用</strong>”支付功能的地方，而 <code>AliPay&#123;&#125;</code> 和 <code>WechatPay&#123;&#125;</code> 是“<strong>实现</strong>”支付功能的地方。</p><p>在没有工厂的情况下，“使用者”<code>main</code> <strong>强依赖</strong>于“实现者”<code>AliPay</code>。</p><ul><li>如果 <code>AliPay</code> 的创建方式变了（比如 <code>NewAliPay(config)</code>），<code>main</code> 函数就必须修改。</li><li>如果 <code>main</code> 函数想换成 <code>WechatPay</code>，<code>main</code> 函数也必须修改。</li></ul><p>而 <code>NewPayment</code> 函数，就是那个“<strong>隔离层</strong>”。</p><blockquote><p><strong>使用工厂的好处是：</strong><br>“使用者”（<code>main</code>）和“具体实现”（<code>AliPay</code>）解耦了。<br><code>main</code> 只依赖“抽象”的 <code>IPayment</code> 接口和“工厂” <code>NewPayment</code>。</p><p>至于 <code>NewPayment</code> 内部是用 <code>switch</code> 还是 <code>if</code>，是返回 <code>&amp;AliPay&#123;&#125;</code> 还是 <code>&amp;AliPay&#123;config: ...&#125;</code>，<code>main</code> 根本不关心。</p></blockquote><p>这，就是“<strong>抽象实现细节</strong>”的第一层。</p><h3 id="一个不叫“工厂”的实现"><a href="#一个不叫“工厂”的实现" class="headerlink" title="一个不叫“工厂”的实现"></a>一个不叫“工厂”的实现</h3><p><code>NewPayment</code> 函数虽然好，但它有个致命缺陷：<strong>违反了“开闭原则”</strong>。</p><p>如果我们要增加一种“银联支付”（<code>UnionPay</code>），我们<strong>必须</strong>去修改 <code>NewPayment</code> 函数的 <code>switch</code> 逻辑。</p><p>这在大型项目中往往会变得更加麻烦。我们希望的是，增加新功能时，<strong>不修改</strong>老代码。</p><p>我们来看一个在 Go 中更常见、更优雅的实现方式 —— <strong>“注册”</strong>。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;fmt&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="comment">// --- 支付接口和具体实现 (这部分不变) ---</span></span><br><span class="line"><span class="keyword">type</span> IPayment <span class="keyword">interface</span> &#123;</span><br><span class="line">Pay(amount <span class="type">float64</span>) <span class="type">string</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">type</span> AliPay <span class="keyword">struct</span>&#123;&#125;</span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(a *AliPay)</span></span> Pay(amount <span class="type">float64</span>) <span class="type">string</span> &#123; <span class="comment">/* ... */</span> <span class="keyword">return</span> <span class="string">&quot;支付宝支付&quot;</span> &#125;</span><br><span class="line"><span class="keyword">type</span> WechatPay <span class="keyword">struct</span>&#123;&#125;</span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(w *WechatPay)</span></span> Pay(amount <span class="type">float64</span>) <span class="type">string</span> &#123; <span class="comment">/* ... */</span> <span class="keyword">return</span> <span class="string">&quot;微信支付&quot;</span> &#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// --- 我们不再叫 &quot;Factory&quot; ---</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// 1. 定义一个“创建者”的函数原型</span></span><br><span class="line"><span class="keyword">type</span> PaymentCreator <span class="function"><span class="keyword">func</span><span class="params">()</span></span> IPayment</span><br><span class="line"></span><br><span class="line"><span class="comment">// 2. 一个全局的“注册表”</span></span><br><span class="line"><span class="keyword">var</span> paymentRegistry = <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>]PaymentCreator)</span><br><span class="line"></span><br><span class="line"><span class="comment">// 3. 注册函数：允许外部注册新的支付方式</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">RegisterPayment</span><span class="params">(payType <span class="type">string</span>, creator PaymentCreator)</span></span> &#123;</span><br><span class="line"><span class="keyword">if</span> _, ok := paymentRegistry[payType]; ok &#123;</span><br><span class="line"><span class="comment">// 警告：重复注册</span></span><br><span class="line">&#125;</span><br><span class="line">paymentRegistry[payType] = creator</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 4. 获取函数：从注册表中获取实例</span></span><br><span class="line"><span class="comment">// 这个函数就是我们新的“工厂”，但它不叫“工厂”</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">GetPayment</span><span class="params">(payType <span class="type">string</span>)</span></span> (IPayment, <span class="type">error</span>) &#123;</span><br><span class="line">creator, ok := paymentRegistry[payType]</span><br><span class="line"><span class="keyword">if</span> !ok &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;不支持的支付方式: %s&quot;</span>, payType)</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// &quot;创建&quot; 的动作在这里发生</span></span><br><span class="line"><span class="keyword">return</span> creator(), <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// --- 在不同的包中初始化 (模拟插件化) ---</span></span><br><span class="line"><span class="comment">// (在实际项目中，这会在 ali_pay.go 和 wechat_pay.go 的 init() 中完成)</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">init</span><span class="params">()</span></span> &#123;</span><br><span class="line">RegisterPayment(<span class="string">&quot;ali&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">()</span></span> IPayment &#123; <span class="keyword">return</span> &amp;AliPay&#123;&#125; &#125;)</span><br><span class="line">RegisterPayment(<span class="string">&quot;wechat&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">()</span></span> IPayment &#123; <span class="keyword">return</span> &amp;WechatPay&#123;&#125; &#125;)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// --- 客户代码 ---</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line"><span class="comment">// 客户现在只依赖 GetPayment</span></span><br><span class="line">payment, err := GetPayment(<span class="string">&quot;ali&quot;</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">fmt.Println(err)</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line">fmt.Println(payment.Pay(<span class="number">100.0</span>))</span><br><span class="line"></span><br><span class="line"><span class="comment">// 思考：如果我们想增加“银联支付”？</span></span><br><span class="line"><span class="comment">// 我们只需要新建 union_pay.go，并在其 init() 中调用 RegisterPayment</span></span><br><span class="line"><span class="comment">// GetPayment 和 main 函数完全不需要改动！</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>这个实现是不是很眼熟？</p><p><strong>Go语言标准库 <code>database/sql</code> 就是这么干的。</strong></p><ul><li><code>sql.Register(driverName string, driver driver.Driver)</code></li><li><code>sql.Open(driverName string, dataSourceName string)</code></li></ul><p><code>sql.Open</code> 就是那个“工厂”函数，它根本不知道有哪些数据库驱动（MySQL, PostgreSQL…），它只管去注册表里查 <code>driverName</code>。而各个驱动包通过 <code>init()</code> 函数把自己注册进去。</p><blockquote><p>注意并不是所有的情况都适合使用 init 方法去隐式注册（有的人不喜欢），那么显示注册也是可以的，只要能达到“使用者”和“实现者”解耦的目的即可。</p></blockquote><h3 id="我不叫“工厂”，就不是工厂了吗？"><a href="#我不叫“工厂”，就不是工厂了吗？" class="headerlink" title="我不叫“工厂”，就不是工厂了吗？"></a>我不叫“工厂”，就不是工厂了吗？</h3><p>现在，我们有了两个实现：</p><ol><li><strong><code>NewPayment</code><strong>：一个巨大的 <code>switch</code>，它</strong>知道所有</strong>具体实现。</li><li><strong><code>GetPayment</code><strong>：一个 <code>map</code> 和一个查找函数，它</strong>不知道任何</strong>具体实现，它只知道一个“注册表”。</li></ol><p>哪个是“工厂模式”？</p><p><strong>答案是：它们都是。</strong></p><p><code>NewPayment</code> 是一个“集中式”的工厂。<br><code>GetPayment</code> (以及 <code>RegisterPayment</code>) 是一个“注册式”的工厂。</p><p>这引出了我们最终的结论：</p><p><strong>工厂模式的本质，不是那个叫 <code>Factory</code> 的类或那个叫 <code>NewXxx</code> 的函数。</strong></p><p><strong>工厂模式的本质是：将“创建对象的具体过程”从“使用对象的地方”中剥离出来。</strong></p><p>它是一种<strong>抽象</strong>。</p><blockquote><p>它在“使用者”（如 <code>main</code> 函数）和“实现者”（如 <code>AliPay</code> 结构体）之间，建立了一个隔离层</p><p>至于是用 <code>switch</code>（简单工厂）、用“接口+实现”（工厂方法）、用“接口返回接口”（抽象工厂），还是用 <code>map</code>（注册器）来实现的……</p><p><strong>这都不重要。</strong></p></blockquote><p>这些只是“术”（实现方式），而“道”（本质）是<strong>抽象</strong>和<strong>隔离</strong>。</p><p>所以，下次当你在代码里看到一个“注册中心”（Registry）、一个“提供者”（Provider）、一个“管理器”（Manager）或一个“服务定位器”（Service Locator）时，如果它的核心职责是<strong>根据某些条件创建并返回一个抽象接口的实例</strong>，那么，它就是工厂模式。</p><p>它叫不叫“工厂”，真的无所谓。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>其实再说的直白一点，只要你抽象了实现细节，让使用者不需要关心具体实现类的创建过程，就是在使用工厂模式了。而“工厂”只是说你抽象之后是一个创建“产品”的说法而已。</p><p>所以，工厂模式的核心思想是<strong>抽象实现细节</strong>，而这个思想是让你的代码更加<strong>解耦</strong>和<strong>灵活</strong>的方法之一。也是程序设计中非常重要的一个原则。</p><p>在实际中我做过大量的 Code Review 其实都是在做抽象这一件事，这个优化真的很重要，所以希望通过这篇文章能让你对工厂模式有一个新的认识。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;工厂模式（Factory</summary>
        
      
    
    
    
    <category term="design-pattern" scheme="https://www.linkinstars.com/categories/design-pattern/"/>
    
    
    <category term="design-pattern" scheme="https://www.linkinstars.com/tags/design-pattern/"/>
    
  </entry>
  
  <entry>
    <title>2025 Cursor Meetup Hangzhou 小记</title>
    <link href="https://www.linkinstars.com/post/12bae9d8.html"/>
    <id>https://www.linkinstars.com/post/12bae9d8.html</id>
    <published>2025-07-11T16:00:00.000Z</published>
    <updated>2025-07-14T10:45:19.113Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/12bae9d8.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<p>原本这次去的目的主要是和老网友面基，顺便听听分享。听完之后，想到其中一些分享还是一些感悟可以记录下来。</p><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>首先作为一个技术人，我听过很多的技术分享和大会，也都是那种非常干货的分享，源码分析、架构场景、技术原理等等，而这次 Meetup 显然不是技术人的盛会而更多的 AI 的一瞥。</p><h2 id="分享要点"><a href="#分享要点" class="headerlink" title="分享要点"></a>分享要点</h2><p>我记录了下面几点在分享中让我感受很深的部分：</p><ul><li><strong>一个 14 岁少年手挫 AI Co-Founder</strong>，第一个演讲的就是这个主题，先不说做的怎么样，真切的让我体会到了 AI 就是让编程直接 0 门槛，那么未来你的优势是什么？</li><li><strong>Cursor Engineer 连线提到，内部 100% Cursor</strong>，也就是 Cursor 自举，自己写自己。</li><li><strong>Cursor 比起直接生成 PPT 不如直接生成 HTML 来得更实在。</strong> 这个可以理解，毕竟我相信 PPT 的训练数据远不如代码的训练数据来的更多，而且现场的演讲者的 PPT 就是通过 AI 做的 HTML。</li><li><strong>如何描述一个我无法准确描述的内容？</strong> 当我们使用 AI 的时候，很多时候会遇到两个问题，一个是我们就不知道要做出什么样，还有一个就是我不知道我该如何表达。其实很简单，在前面多加一步，先让 AI 先拆解这个问题，让它向你询问并获取信息。</li><li><strong>AI 如何写脱口秀？</strong> 当 AI 遇到生活化的场景时，往往描述让人觉得很奇怪，因为它是没有感知的，于是写的脱口秀就不好笑。怎么解决呢？我们可以将这样的感知场景和数据给他让他上下文从而的输出更生活化。</li><li><strong>如果 Agent 足够强大，人做什么？</strong> 提出好的需求+培养审美+领导力。</li></ul><h2 id="我的思考"><a href="#我的思考" class="headerlink" title="我的思考"></a>我的思考</h2><h3 id="立刻-Coding？"><a href="#立刻-Coding？" class="headerlink" title="立刻 Coding？"></a>立刻 Coding？</h3><p>程序员做产品最容易犯的错误是，<strong>有了想法，立刻编码</strong>。而到了 AI 时代，编码的试错成本变得很低，就导致这个错误依旧在延续。快速做出 Demo 试错本身没有问题，但是我认为，更好的方式还是在给 AI 之前，先问问自己或问问 AI 你的需求是什么，准备如何去做，可能有哪些问题，而不是直接 Coding 然后再改。因为现在 AI 修改很容易造成屎山代码。</p><h3 id="文档记录好帮手"><a href="#文档记录好帮手" class="headerlink" title="文档记录好帮手"></a>文档记录好帮手</h3><p>AI 由于现在上下文的关系，很容易忘记项目之前出现过的一些问题和要求，而如果有一个 Project recording 去记录你做的重要更改和需求，那么每次只需要告诉 AI 你可以先查看之前的项目记录，遵守之前的约定和改动，这些记录就能更好的帮助它完成任务。</p><h3 id="根据效果重构提示词"><a href="#根据效果重构提示词" class="headerlink" title="根据效果重构提示词"></a>根据效果重构提示词</h3><p>如果你不知道如何写，或者写不好提示词，那么请先保存现在的项目情况，然后直接随便写，然后看效果，看完效果不对之后，不是立刻让 AI 修改，有时候可能需要直接先回滚代码，然后重构你的提示词，补充你认为前一次做的不对的地方，然后再看效果，这样会比有时候的直接修改来的更靠谱。</p><h3 id="你是喜欢写代码吗？"><a href="#你是喜欢写代码吗？" class="headerlink" title="你是喜欢写代码吗？"></a>你是喜欢写代码吗？</h3><p>在没有 AI 之前，你如果问我这个问题，我毫无疑问会说是。我当然是喜欢写代码的。而有了 AI 之后，<strong>我发现其实我最终并不是喜欢写代码，而是喜欢创造或利用编程去改变生活，无论是自己的生活还是别人的生活。</strong> 编码是我觉得能做出自己想要的东西，现在可以变成了我直接能让 AI 给我想要的东西了。所以或许我们是时候做出改变了。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>AI 让我们回到最开始的样子，让我们更专注于用户最终的需求本身，而不是无脑的技术堆砌。</p><p><img src="https://blog.linkinstars.com/blog/2025-cursor-meetup-hangzhou-0.jpg"><br><img src="https://blog.linkinstars.com/blog/2025-cursor-meetup-hangzhou-1.jpg"></p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;p&gt;原本这次去的目的主要是和老网友面基，顺便听听分享。听完之后，想到其中一些分享还是一些感悟可以记录下来。&lt;/p&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="ai" scheme="https://www.linkinstars.com/categories/ai/"/>
    
    
    <category term="ai" scheme="https://www.linkinstars.com/tags/ai/"/>
    
  </entry>
  
  <entry>
    <title>浅谈结合 AI 开发应用：从技术狂热到价值落地的思考</title>
    <link href="https://www.linkinstars.com/post/de814b49.html"/>
    <id>https://www.linkinstars.com/post/de814b49.html</id>
    <published>2025-05-29T16:00:00.000Z</published>
    <updated>2025-06-13T11:36:48.457Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/de814b49.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>我平常会更喜欢写技术类型的博客，其实很少写思考类型的博客，这一次本来我是已经写了几篇博客：</p><ul><li><strong>结合 RAG 构建你的知识库</strong></li><li><strong>利用 transformers LoRA 微调模型</strong></li><li><strong>训练属于你自己的 AI 问答模型</strong></li></ul><p>写完这些之后，说实话，我有点意兴阑珊。不是因为技术不酷，相反，这些 AI 强大得令人兴奋。然后我发现这些内容好像少了点<strong>灵魂</strong>，而且网上的教程已经足够多了，特别是视频，一步步教你微调，<strong>比我的肯定要详细</strong>，于是思考了一下，重新构思才有了今天这篇文章。相比手把手教你敲代码、调参数，我更想聊聊这段时间<strong>在尝试“结合 AI 开发应用”过程中一些感悟</strong>。这更像是一场自我反思，也希望能引发一些同路人的共鸣。</p><h2 id="结合的方式"><a href="#结合的方式" class="headerlink" title="结合的方式"></a>结合的方式</h2><p>首先来说说结合 AI 开发应用的方式，实际中当然有很多，我总结分类一下为：</p><ul><li>直接利用 AI 对话能力</li><li>RAG</li><li>微调</li><li>MCP&#x2F;Agent</li></ul><p>所以我将从这几个分类聊聊尝试过后给我的感受。</p><h2 id="直接利用-AI-对话能力"><a href="#直接利用-AI-对话能力" class="headerlink" title="直接利用 AI 对话能力"></a>直接利用 AI 对话能力</h2><p>这是最直接、门槛最低的方式。开发者通过设计精心构造的提示词（Prompt），调用大型语言模型提供的对话&#x2F;补全 API，让模型根据输入生成文本输出。我试过的几个场景是：</p><ul><li>问答</li><li>摘要</li><li>翻译</li></ul><p>都很不错，接入方便好用，大家也都知道，而这里结合和能否成功的关键在于 <strong>产品化</strong>。因为这个门槛最低，你的产品交互能否让用户感觉丝滑特别重要。</p><blockquote><p>我的总结思考是：让普通人跳过和 AI 对话的步骤，直接得到想要的结果（<strong>路径缩短</strong>），这是这类产品的目标。</p></blockquote><p>当然它的缺点是不够专业（我叫作<strong>知识局限性</strong>），毕竟它得到的数据都是公开的，对于它从来不知道的事情来说，它也 无能为力。那么就需要下面的技术了。</p><h2 id="RAG"><a href="#RAG" class="headerlink" title="RAG"></a>RAG</h2><p>从你提供的个人数据库中检索出最相关的信息片段。然后将<strong>原始问题 + 检索到的相关上下文</strong>一起打包成一个新的 Prompt，发送给 AI 生成最终答案。</p><p>这里的检索往往都会采用向量的方式，之前我也写过向量数据库相关的文章，有的时候检索结果并不尽人意（不过我也没优化过）。</p><p>而其实尝试过 RAG 的同学应该都知道，这个方式最大的问题就在于数据的搜索出来的准确性：</p><ul><li>如果原始的问题在你的数据集里面本来就没有答案，凉凉</li><li>如果原始的问题在你的数据集里面有答案，但是你通过搜索的方式没找到，也凉了</li></ul><blockquote><p>我的总结和思考是：如果你本来就能通过搜索内容找到答案，那么借助 RAG 能更好的帮助你将答案组织为可读的逻辑给到用户；相反如果本身你的搜索能力就不及格，那就别指望它了。</p></blockquote><h2 id="微调"><a href="#微调" class="headerlink" title="微调"></a>微调</h2><p>通过技术微调已有的模型，从而建立专业领域的模型，来完成特定领域场景的任务。</p><p>其实就是在一个预训练好的大型语言模型（基模）上，使用<strong>你指定的数据集</strong>对模型的一部分或全部参数进行额外的训练。目的是让模型<strong>更擅长解决你的特定任务或适应你的特定领域语言风格</strong>。</p><p>微调给我的感受有两个：</p><ul><li>训练数据很重要，验证数据更重要。我往往找到了一大堆数据可以进行训练，但是如果仅仅只是训练，<strong>你是无法评估你微调前后的差别的</strong>。也就是说你无法评判一个微调模型的好坏，那你实际去用的时候其实是没底的。</li><li>没办法保证后续的训练不会对前面的训练有影响。</li></ul><p>而微调其实现在已经比你想得要容易了，已经有了很多开源的解决方案来帮助你微调模型，你几乎只需要改改参数，就可以了，有的微调甚至还有 UI 可以直接选择等等。甚至一些云厂商可以直接支持数据微调，并直接提供微调后的模型运行。</p><p>最重要的是什么？成本！训练成本(小)+运行成本(大)，是否能支撑合理的运行成本是你是否要采用微调的关键。</p><h2 id="MCP-Agent"><a href="#MCP-Agent" class="headerlink" title="MCP&#x2F;Agent"></a>MCP&#x2F;Agent</h2><p>通过外部服务能力，利用 AI 作为大脑，自动化原本的费时场景。MCP&#x2F;Agent 可以作为手，直接触及到用户使用的场景和行为，帮助你完成一些“力所能及的事”。</p><p>这其实是一个很不错的结合场景，因为大多数 AI 仅仅只是对话的话能提供的帮助太少，或者说自动化程度不高。用户都是懒的，被 AI 会越惯越懒，他们（我）甚至只想动动嘴皮子，就把事情做了。所以，如果 AI 仅仅告诉你怎么做，那么还不够，如果能把事情直接完成，那才是尽善尽美的服务。</p><p>整个过程可以被抽象为：</p><ul><li>规划：分解复杂目标为可执行的子任务。</li><li>工具使用：根据需求调用外部工具或服务(MCP)。</li><li>记忆：存储和利用历史信息。</li><li>反思：评估执行结果，修正错误或调整计划。。</li></ul><p>如果你用过各种 AI 的 IDE 你会发现他们的很多设计也是这几个要点。将你的请求先思考，拆分为一个个任务，然后通过本地的各种 tools 去编写修改代码，然后去验证一下写的对不对，不对再调整，如果整个上下文长了还会总结之前的工作交给下一个对话继续操作。</p><p>在这里，最重要的部分是协作。是 AI 与工具的协作。我的思考是：<strong>一定要清晰的告诉 AI 你的工具能什么</strong>，而对于 MCP 来说，提供的选择有时候越少越好。有时候如果你没有清楚的告诉 AI 你的工具的能力，很有可能会被误用，而 MCP 有的时候是因为你提供的工具或者选择太多，导致 AI 误用了很多没必要的工具去操作。</p><h2 id="浅谈与思考"><a href="#浅谈与思考" class="headerlink" title="浅谈与思考"></a>浅谈与思考</h2><h3 id="妄自菲薄"><a href="#妄自菲薄" class="headerlink" title="妄自菲薄"></a>妄自菲薄</h3><p>千万不要妄自菲薄，我经常听到的一句话是：“用户是不是不需要我们这个产品，因为用户自己把东西交给 AI 提问也能得到结果” 错，哪怕你仅仅只是优化了 Prompt 加上了交互，也会有用户付费。<strong>路径缩短</strong> 或者说节约用户行为时间，这样的体验就会产生依赖。</p><h3 id="数据为王"><a href="#数据为王" class="headerlink" title="数据为王"></a>数据为王</h3><p>RAG 依赖高质量的知识库，微调需要大量标注精良的数据。然而，现实世界中，企业的文档散乱陈旧，用户数据涉及隐私难以获取，标注成本高昂且质量参差不齐。<strong>数据工程的质量和规模，往往直接决定了 AI 应用的成败上限，而这恰恰是很多项目（尤其是个人或小团队项目）最难逾越的鸿沟。</strong> 没有好数据，再好的模型也是“巧妇难为无米之炊”。</p><p>我的思考是这样一个例子，我自认为人类的大脑比 AI 强大（在私有领域），换作是我，收到 RAG 最终给我的 Prompt 我都无法解决这个问题，那我凭什么期待 AI 能解决呢？对于这个领域(私有)可能它知道并不比我多啊。</p><blockquote><p>我都不知道公司打印机在哪里，AI 怎么会知道呢？</p></blockquote><h3 id="产品思维"><a href="#产品思维" class="headerlink" title="产品思维"></a>产品思维</h3><p>掌握一项 AI 技术（如 RAG、微调）让我们具备了“能做”的能力，这很重要，是基础。但决定一个 AI 应用能否成功、能否真正产生价值的，是<strong>我们能否精准地找到那个“值得做”的问题，并用产品思维、工程能力和对用户的理解，将技术转化为切实的解决方案。</strong> 这个过程，远比调通一个 API、跑通一个微调脚本要复杂和艰难得多。它要求我们从技术的狂热中冷静下来，更深入地走进用户的真实世界，更务实地看待技术的边界与成本，更精心地设计产品与体验。</p><p>技术是锤子，但用户需要解决的问题才是那颗钉子。顺序反了，再好的锤子也敲不准。</p><h3 id="MVP"><a href="#MVP" class="headerlink" title="MVP"></a>MVP</h3><p>最后是 <strong>最小可行产品 (MVP) + 快速迭代：</strong>，不要追求一上来就做一个完美的、功能繁多的应用。聚焦一个最核心用最快的速度（可能利用现成的 API、简单 RAG 或零样本&#x2F;少样本提示工程）做出一个粗糙但能用的 MVP。<strong>尽快让真实用户用起来，收集反馈，验证价值假设。</strong> 基于反馈决定是放弃、转向还是深入优化（这时才可能需要微调、复杂 RAG 等）。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>技术会发展，新的与 AI 结合的技术还会不断出现，而结合 AI 开发应用，这场盛宴才刚刚开始，而找到那把开启真正价值之门的钥匙，需要我们所有人持续地思考、实践和反思。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;我平常会更喜欢写技术类型的博客，其实很少写思考类型的博客，这一次本来我是已经写了几篇博客：&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;strong&gt;结合</summary>
        
      
    
    
    
    <category term="ai" scheme="https://www.linkinstars.com/categories/ai/"/>
    
    
    <category term="ai" scheme="https://www.linkinstars.com/tags/ai/"/>
    
  </entry>
  
  <entry>
    <title>利用 transformers LoRA 微调模型</title>
    <link href="https://www.linkinstars.com/post/de3ff5b4.html"/>
    <id>https://www.linkinstars.com/post/de3ff5b4.html</id>
    <published>2025-05-14T16:00:00.000Z</published>
    <updated>2025-06-13T11:39:04.850Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/de3ff5b4.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="什么是-transformers"><a href="#什么是-transformers" class="headerlink" title="什么是 transformers"></a>什么是 transformers</h2><blockquote><p>🤗 Transformers 提供了便于快速下载和使用的 API，让你可以把预训练模型用在给定文本、在你的数据集上微调然后通过 model hub 与社区共享。同时，每个定义的 Python 模块都是完全独立的，便于修改和快速进行研究实验。</p></blockquote><p><a href="https://github.com/huggingface/transformers">https://github.com/huggingface/transformers</a></p><p>简单说就是利用这个库可以快速简单帮我们微调模型，很多东西都封装好了可以直接拿来用。</p><h2 id="准备"><a href="#准备" class="headerlink" title="准备"></a>准备</h2><ul><li>租一个服务器，我尝试的是 4090</li><li>选择一个支持 transformers 容器镜像环境</li><li>如果没有，也可以选择自己安装 <a href="https://github.com/huggingface/transformers/blob/main/i18n/README_zh-hans.md#%E5%AE%89%E8%A3%85">https://github.com/huggingface/transformers/blob/main/i18n/README_zh-hans.md#%E5%AE%89%E8%A3%85</a></li></ul><h2 id="步骤"><a href="#步骤" class="headerlink" title="步骤"></a>步骤</h2><p>下载基模，使用 <code>hf-mirror</code> 会快，或者镜像环境如果支持魔法可以直接魔法下</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">cd</span> /tmp</span><br><span class="line">$ git lfs install</span><br><span class="line">$ git <span class="built_in">clone</span> https://hf-mirror.com/Qwen/Qwen2.5-7B-Instruct</span><br><span class="line">$ <span class="built_in">cd</span> Qwen2.5-7B-Instruct</span><br></pre></td></tr></table></figure><h2 id="微调代码"><a href="#微调代码" class="headerlink" title="微调代码"></a>微调代码</h2><h3 id="数据集处理-data-preprocess-py"><a href="#数据集处理-data-preprocess-py" class="headerlink" title="数据集处理 data_preprocess.py"></a>数据集处理 data_preprocess.py</h3><p>这个部分需要根据你实际的具体数据格式进行处理，其中的关键就是读取已有的数据格式，将已有的数据格式转换为训练所需的数据集格式而已。</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br><span class="line">87</span><br><span class="line">88</span><br><span class="line">89</span><br><span class="line">90</span><br><span class="line">91</span><br><span class="line">92</span><br><span class="line">93</span><br><span class="line">94</span><br><span class="line">95</span><br><span class="line">96</span><br><span class="line">97</span><br><span class="line">98</span><br><span class="line">99</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> json</span><br><span class="line"><span class="keyword">from</span> torch.utils.data <span class="keyword">import</span> Dataset</span><br><span class="line"></span><br><span class="line"><span class="comment"># 定义一个名为 MyDataset 的新类，它继承自 Dataset 类</span></span><br><span class="line"><span class="keyword">class</span> <span class="title class_">MyDataset</span>(<span class="title class_ inherited__">Dataset</span>):</span><br><span class="line">    <span class="comment"># 定义类的初始化方法</span></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__init__</span>(<span class="params">self, data, tokenizer, args</span>):</span><br><span class="line">        <span class="built_in">super</span>(MyDataset, <span class="variable language_">self</span>).__init__()</span><br><span class="line">        <span class="variable language_">self</span>.data = data</span><br><span class="line">        <span class="variable language_">self</span>.tokenizer = tokenizer</span><br><span class="line">        <span class="variable language_">self</span>.prompt_column = args.prompt_column</span><br><span class="line">        <span class="variable language_">self</span>.response_column = args.response_column</span><br><span class="line">        <span class="variable language_">self</span>.max_source_length = args.max_source_length</span><br><span class="line">        <span class="variable language_">self</span>.max_target_length = args.max_target_length</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 返回数据集的长度</span></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__len__</span>(<span class="params">self</span>):</span><br><span class="line">        <span class="keyword">return</span> <span class="built_in">len</span>(<span class="variable language_">self</span>.data)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 通过索引访问数据集中的样本</span></span><br><span class="line">    <span class="keyword">def</span> <span class="title function_">__getitem__</span>(<span class="params">self, i</span>):</span><br><span class="line">        item = <span class="variable language_">self</span>.data[i]</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 获取消息列表</span></span><br><span class="line">        messages = item[<span class="string">&quot;messages&quot;</span>]</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 分离系统消息、用户消息和助手消息</span></span><br><span class="line">        system_message = messages[<span class="number">0</span>]  <span class="comment"># 系统消息</span></span><br><span class="line">        user_message = messages[<span class="number">1</span>]    <span class="comment"># 用户消息</span></span><br><span class="line">        assistant_message = messages[<span class="number">2</span>]  <span class="comment"># 助手消息</span></span><br><span class="line"></span><br><span class="line">        <span class="comment"># 构建提示文本（系统消息 + 用户消息）</span></span><br><span class="line">        prompt = build_prompt(system_message, user_message)</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 构建响应文本（助手消息）</span></span><br><span class="line">        response = build_response(assistant_message)</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 对提示文本进行编码</span></span><br><span class="line">        context = <span class="variable language_">self</span>.tokenizer(</span><br><span class="line">            prompt,</span><br><span class="line">            max_length=<span class="variable language_">self</span>.max_source_length,</span><br><span class="line">            add_special_tokens=<span class="literal">False</span>)</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 对响应文本进行编码</span></span><br><span class="line">        response_encoding = <span class="variable language_">self</span>.tokenizer(</span><br><span class="line">            response,</span><br><span class="line">            max_length=<span class="variable language_">self</span>.max_target_length,</span><br><span class="line">            add_special_tokens=<span class="literal">False</span>)</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 将上下文和响应的 input_ids 连接起来</span></span><br><span class="line">        input_ids = context[<span class="string">&quot;input_ids&quot;</span>] + response_encoding[<span class="string">&quot;input_ids&quot;</span>]</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 将上下文和响应的注意力掩码连接起来</span></span><br><span class="line">        attention_mask = context[<span class="string">&quot;attention_mask&quot;</span>] + response_encoding[<span class="string">&quot;attention_mask&quot;</span>]</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 创建标签数组，标记上下文部分为 -100，响应部分使用真实的 input_ids</span></span><br><span class="line">        labels = [-<span class="number">100</span>] * <span class="built_in">len</span>(context[<span class="string">&quot;input_ids&quot;</span>]) + response_encoding[<span class="string">&quot;input_ids&quot;</span>]</span><br><span class="line"></span><br><span class="line">        <span class="comment"># 确保输入 ID 和标签的长度一致</span></span><br><span class="line">        <span class="keyword">assert</span> <span class="built_in">len</span>(input_ids) == <span class="built_in">len</span>(labels), <span class="string">f&quot;length mismatch: <span class="subst">&#123;<span class="built_in">len</span>(input_ids)&#125;</span> vs <span class="subst">&#123;<span class="built_in">len</span>(labels)&#125;</span>&quot;</span></span><br><span class="line"></span><br><span class="line">        <span class="comment"># 返回编码后的数据</span></span><br><span class="line">        <span class="keyword">return</span> &#123;</span><br><span class="line">            <span class="string">&quot;input_ids&quot;</span>: input_ids,</span><br><span class="line">            <span class="string">&quot;attention_mask&quot;</span>: attention_mask,</span><br><span class="line">            <span class="string">&quot;labels&quot;</span>: labels</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line"><span class="comment"># 构建提示文本</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">build_prompt</span>(<span class="params">system_message, user_message</span>):</span><br><span class="line">    <span class="comment"># 获取系统消息内容和工具定义</span></span><br><span class="line">    system_content = system_message[<span class="string">&quot;content&quot;</span>]</span><br><span class="line">    tools = system_message.get(<span class="string">&quot;tools&quot;</span>, [])</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 获取用户消息内容</span></span><br><span class="line">    user_content = user_message[<span class="string">&quot;content&quot;</span>]</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 构建提示文本</span></span><br><span class="line">    prompt = <span class="string">f&quot;&lt;|im_start|&gt;system\n<span class="subst">&#123;system_content&#125;</span>&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment"># 添加工具定义（如果有）</span></span><br><span class="line">    <span class="keyword">if</span> tools:</span><br><span class="line">        tool_str = json.dumps(tools, ensure_ascii=<span class="literal">False</span>)</span><br><span class="line">        prompt += <span class="string">f&quot;\n\nTools: <span class="subst">&#123;tool_str&#125;</span>&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="comment"># 添加用户消息</span></span><br><span class="line">    prompt += <span class="string">f&quot;\n&lt;|im_end|&gt;\n&lt;|im_start|&gt;user\n<span class="subst">&#123;user_content&#125;</span>\n&lt;|im_end|&gt;\n&lt;|im_start|&gt;assistant\n&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> prompt</span><br><span class="line"></span><br><span class="line"><span class="comment"># 构建响应文本</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">build_response</span>(<span class="params">assistant_message</span>):</span><br><span class="line">    <span class="comment"># 获取助手消息内容</span></span><br><span class="line">    assistant_content = assistant_message[<span class="string">&quot;content&quot;</span>]</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 构建响应文本</span></span><br><span class="line">    response = <span class="string">f&quot;<span class="subst">&#123;assistant_content&#125;</span>\n&lt;|im_end|&gt;&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> response</span><br></pre></td></tr></table></figure><p>看代码容易迷糊，直接看数据案例，首先是提示文本大概的样子，我们需要将原始的数据集转换为<strong>对话模型训练</strong>的格式，所以<strong>原始格式不重要</strong>，只要你最终能整成这样都可以。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">&lt;|im_start|&gt;system</span><br><span class="line">You are a helpful assistant.</span><br><span class="line">&lt;|im_end|&gt;</span><br><span class="line">&lt;|im_start|&gt;user</span><br><span class="line">What is the capital of France?</span><br><span class="line">&lt;|im_end|&gt;</span><br><span class="line">&lt;|im_start|&gt;assistant</span><br><span class="line">The capital of France is Paris.</span><br><span class="line">&lt;|im_end|&gt;</span><br></pre></td></tr></table></figure><p>然后是最终 tokenizer 后的样子</p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">  &#x27;input_ids&#x27;<span class="punctuation">:</span> tensor(<span class="punctuation">[</span> <span class="number">101</span><span class="punctuation">,</span> <span class="number">7429</span><span class="punctuation">,</span> <span class="number">2105</span><span class="punctuation">,</span> <span class="number">2019</span><span class="punctuation">,</span> <span class="number">17341</span><span class="punctuation">,</span>  <span class="number">999</span><span class="punctuation">,</span> <span class="number">2005</span><span class="punctuation">,</span> <span class="number">3185</span><span class="punctuation">,</span> <span class="number">4201</span><span class="punctuation">,</span> <span class="number">14926</span><span class="punctuation">,</span> <span class="number">2000</span><span class="punctuation">,</span> <span class="number">7976</span><span class="punctuation">,</span> <span class="number">22124</span><span class="punctuation">,</span>  <span class="number">102.</span>.....<span class="punctuation">]</span>)<span class="punctuation">,</span></span><br><span class="line">  &#x27;attention_mask&#x27;<span class="punctuation">:</span> tensor(<span class="punctuation">[</span><span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> <span class="number">1</span><span class="punctuation">,</span> ....<span class="punctuation">]</span>)<span class="punctuation">,</span></span><br><span class="line">  &#x27;labels&#x27;<span class="punctuation">:</span> tensor(<span class="number">1</span>)</span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><h3 id="微调-finetune-py"><a href="#微调-finetune-py" class="headerlink" title="微调 finetune.py"></a>微调 finetune.py</h3><p>可以参考</p><ul><li><a href="https://github.com/sahil280114/codealpaca/blob/2f78ddc5c682ed6738ad092bbbfa59ba915afcb0/train.py#L183">https://github.com/sahil280114/codealpaca/blob/2f78ddc5c682ed6738ad092bbbfa59ba915afcb0/train.py#L183</a></li><li><a href="https://github.com/ophiraShen/I-AM/blob/1f3bccc7e84c385f6f2bceb564d0b730c5627f89/models/glm_4_9b_chat/evaluate.py">https://github.com/ophiraShen/I-AM/blob/1f3bccc7e84c385f6f2bceb564d0b730c5627f89/models/glm_4_9b_chat/evaluate.py</a></li></ul><p>网上还有很多，就不一一举例，直接让 GPT 写一个就好了，因为 transformers 都封装好了，其中最关键的还是在参数的设置上</p><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 导入所需模块和库</span></span><br><span class="line"><span class="keyword">import</span> json</span><br><span class="line"><span class="keyword">import</span> torch</span><br><span class="line"><span class="keyword">from</span> transformers <span class="keyword">import</span> (</span><br><span class="line">    AutoTokenizer,</span><br><span class="line">    AutoModelForCausalLM,</span><br><span class="line">    BitsAndBytesConfig,</span><br><span class="line">    DataCollatorForSeq2Seq,</span><br><span class="line">    HfArgumentParser,</span><br><span class="line">    TrainingArguments,</span><br><span class="line">    Trainer</span><br><span class="line">)</span><br><span class="line"><span class="keyword">from</span> peft <span class="keyword">import</span> LoraConfig, TaskType, get_peft_model</span><br><span class="line"><span class="keyword">from</span> arguments <span class="keyword">import</span> ModelArguments, DataTrainingArguments, PeftArguments</span><br><span class="line"><span class="keyword">from</span> data_preprocess <span class="keyword">import</span> MyDataset</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">main</span>():</span><br><span class="line">    <span class="comment"># 解析命令行参数</span></span><br><span class="line">    parser = HfArgumentParser((ModelArguments, DataTrainingArguments, PeftArguments, TrainingArguments))</span><br><span class="line">    model_args, data_args, peft_args, training_args = parser.parse_args_into_dataclasses()</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 加载预训练模型和分词器</span></span><br><span class="line">    model = AutoModelForCausalLM.from_pretrained(model_args.model_name_or_path, torch_dtype=torch.bfloat16)</span><br><span class="line">    tokenizer = AutoTokenizer.from_pretrained(model_args.model_name_or_path)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 设置LoRA配置</span></span><br><span class="line">    lora_config = LoraConfig(</span><br><span class="line">        inference_mode=<span class="literal">False</span>,</span><br><span class="line">        task_type=TaskType.CAUSAL_LM,</span><br><span class="line">        target_modules=[<span class="string">&quot;q_proj&quot;</span>, <span class="string">&quot;k_proj&quot;</span>, <span class="string">&quot;v_proj&quot;</span>],</span><br><span class="line">        r=peft_args.lora_rank,</span><br><span class="line">        lora_alpha=peft_args.lora_alpha,</span><br><span class="line">        lora_dropout=peft_args.lora_dropout</span><br><span class="line">    )</span><br><span class="line">    <span class="comment"># 应用LoRA配置到模型</span></span><br><span class="line">    model = get_peft_model(model, lora_config).to(<span class="string">&quot;cuda&quot;</span>)</span><br><span class="line">    <span class="comment"># 输出可训练参数数量</span></span><br><span class="line">    model.print_trainable_parameters()</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 设置数据规整器</span></span><br><span class="line">    data_collator = DataCollatorForSeq2Seq(</span><br><span class="line">        tokenizer=tokenizer,</span><br><span class="line">        padding=<span class="literal">True</span></span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 读取训练数据</span></span><br><span class="line">    <span class="keyword">if</span> training_args.do_train:</span><br><span class="line">        <span class="keyword">with</span> <span class="built_in">open</span>(data_args.train_file, <span class="string">&quot;r&quot;</span>, encoding=<span class="string">&quot;utf-8&quot;</span>) <span class="keyword">as</span> f:</span><br><span class="line">            train_data = [json.loads(line) <span class="keyword">for</span> line <span class="keyword">in</span> f]</span><br><span class="line">        <span class="comment"># Dataset 处理数据</span></span><br><span class="line">        train_dataset = Dataset(train_data, tokenizer, data_args)</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 实例化Trainer</span></span><br><span class="line">    trainer = Trainer(</span><br><span class="line">        model=model,</span><br><span class="line">        tokenizer=tokenizer,</span><br><span class="line">        data_collator=data_collator,</span><br><span class="line">        args=training_args,</span><br><span class="line">        train_dataset=train_dataset <span class="keyword">if</span> training_args.do_train <span class="keyword">else</span> <span class="literal">None</span>,</span><br><span class="line">        eval_dataset=eval_dataset <span class="keyword">if</span> training_args.do_eval <span class="keyword">else</span> <span class="literal">None</span>,</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    <span class="comment"># 训练模型</span></span><br><span class="line">    <span class="keyword">if</span> training_args.do_train:</span><br><span class="line">        model.gradient_checkpointing_enable()</span><br><span class="line">        model.enable_input_require_grads()</span><br><span class="line">        trainer.train()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> __name__ == <span class="string">&quot;__main__&quot;</span>:</span><br><span class="line">    main()</span><br></pre></td></tr></table></figure><h3 id="启动脚本-train-sh"><a href="#启动脚本-train-sh" class="headerlink" title="启动脚本 train.sh"></a>启动脚本 train.sh</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#! /usr/bin/env bash</span></span><br><span class="line"></span><br><span class="line"><span class="built_in">set</span> -ex</span><br><span class="line"></span><br><span class="line">LR=2e-4</span><br><span class="line"></span><br><span class="line">DATESTR=`<span class="built_in">date</span> +%Y%m%d-%H%M%S`</span><br><span class="line">RUN_NAME=ft_linkinstar_qwen2</span><br><span class="line">OUTPUT_DIR=output/<span class="variable">$&#123;RUN_NAME&#125;</span>-<span class="variable">$&#123;DATESTR&#125;</span></span><br><span class="line"><span class="built_in">mkdir</span> -p <span class="variable">$OUTPUT_DIR</span></span><br><span class="line"></span><br><span class="line"><span class="comment"># 修改你的模型路径</span></span><br><span class="line">MODEL_PATH=<span class="string">&quot;/tmp/Qwen2.5-7B-Instruct&quot;</span></span><br><span class="line"></span><br><span class="line">CUDA_VISIBLE_DEVICES=0 python finetune.py \</span><br><span class="line">    --do_train \</span><br><span class="line">    --train_file ../data/train.jsonl \</span><br><span class="line">    --model_name_or_path <span class="string">&quot;<span class="variable">$&#123;MODEL_PATH&#125;</span>&quot;</span> \</span><br><span class="line">    --output_dir <span class="variable">$OUTPUT_DIR</span> \</span><br><span class="line">    --max_source_length 2048 \</span><br><span class="line">    --max_target_length 1024 \</span><br><span class="line">    --per_device_train_batch_size 1 \</span><br><span class="line">    --per_device_eval_batch_size 1 \</span><br><span class="line">    --gradient_accumulation_steps 4 \</span><br><span class="line">    --evaluation_strategy steps \</span><br><span class="line">    --eval_steps 300 \</span><br><span class="line">    --num_train_epochs 3 \</span><br><span class="line">    --logging_steps 30 \</span><br><span class="line">    --logging_dir <span class="variable">$OUTPUT_DIR</span>/logs \</span><br><span class="line">    --save_steps 200 \</span><br><span class="line">    --learning_rate <span class="variable">$LR</span> \</span><br><span class="line">    --lora_rank 8 \</span><br><span class="line">    --lora_alpha 32 \</span><br><span class="line">    --lora_dropout 0.1 2&gt;&amp;1 | <span class="built_in">tee</span> <span class="variable">$&#123;OUTPUT_DIR&#125;</span>/train.log</span><br></pre></td></tr></table></figure><ul><li>其中如果你租的 GPU 多可以调整 <code>per_device_train_batch_size</code> 和 <code>per_device_eval_batch_size</code> 更大 2、4…</li><li>然后其中 LR 是学习率</li><li><code>response_column</code> 和 <code>prompt_column</code> 是用来指定字段用的，不用看，因为没用到，我们直接处理数据内容为对应格式就好</li><li><code>gradient_accumulation_steps</code> 如果 GPU 内存够 48+ 可以减少</li></ul><h3 id="使用"><a href="#使用" class="headerlink" title="使用"></a>使用</h3><figure class="highlight python"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> argparse</span><br><span class="line"><span class="keyword">from</span> peft <span class="keyword">import</span> PeftModel</span><br><span class="line"><span class="keyword">from</span> transformers <span class="keyword">import</span> AutoTokenizer, AutoModelForCausalLM</span><br><span class="line"></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">load_model</span>(<span class="params">model_path, checkpoint_path</span>):</span><br><span class="line">    tokenizer = AutoTokenizer.from_pretrained(model_path)</span><br><span class="line">    model = AutoModelForCausalLM.from_pretrained(model_path, torch_dtype=torch.bfloat16)</span><br><span class="line">    model = PeftModel.from_pretrained(model, model_id=checkpoint_path).to(<span class="string">&quot;cuda&quot;</span>).<span class="built_in">eval</span>()</span><br><span class="line">    <span class="keyword">return</span> tokenizer, model</span><br><span class="line"></span><br><span class="line"><span class="comment"># 初始化全局变量</span></span><br><span class="line">parser = argparse.ArgumentParser()</span><br><span class="line">parser.add_argument(<span class="string">&quot;--model&quot;</span>, <span class="built_in">type</span>=<span class="built_in">str</span>, default=<span class="literal">None</span>, required=<span class="literal">True</span>, <span class="built_in">help</span>=<span class="string">&quot;main model weights&quot;</span>)</span><br><span class="line">parser.add_argument(<span class="string">&quot;--ckpt&quot;</span>, <span class="built_in">type</span>=<span class="built_in">str</span>, default=<span class="literal">None</span>, required=<span class="literal">True</span>, <span class="built_in">help</span>=<span class="string">&quot;The checkpoint path&quot;</span>)</span><br><span class="line">args = parser.parse_args()</span><br><span class="line"></span><br><span class="line"><span class="comment"># 加载模型</span></span><br><span class="line">tokenizer, model = load_model(args.model, args.ckpt)</span><br><span class="line"></span><br><span class="line">parser = argparse.ArgumentParser()</span><br><span class="line">parser.add_argument(<span class="string">&quot;--model&quot;</span>, <span class="built_in">type</span>=<span class="built_in">str</span>, default=<span class="literal">None</span>, required=<span class="literal">True</span>, <span class="built_in">help</span>=<span class="string">&quot;main model weights&quot;</span>)</span><br><span class="line">parser.add_argument(<span class="string">&quot;--ckpt&quot;</span>, <span class="built_in">type</span>=<span class="built_in">str</span>, default=<span class="literal">None</span>, required=<span class="literal">True</span>, <span class="built_in">help</span>=<span class="string">&quot;The checkpoint path&quot;</span>)</span><br><span class="line">args = parser.parse_args()</span><br><span class="line"></span><br><span class="line">tokenizer, model = load_model(args.model, args.ckpt)</span><br><span class="line"></span><br><span class="line"><span class="comment"># 获取模型生成的回复</span></span><br><span class="line"><span class="keyword">def</span> <span class="title function_">get_completion</span>(<span class="params">prompt</span>):</span><br><span class="line">    inputs = tokenizer([prompt], return_tensors=<span class="string">&quot;pt&quot;</span>).to(<span class="string">&quot;cuda&quot;</span>)</span><br><span class="line">    <span class="keyword">with</span> torch.no_grad():</span><br><span class="line">        outputs = model.generate(**inputs, max_new_tokens=<span class="number">1024</span>)</span><br><span class="line">        response = tokenizer.decode(outputs[:,inputs[<span class="string">&#x27;input_ids&#x27;</span>].shape[<span class="number">1</span>]:][<span class="number">0</span>], skip_special_tokens=<span class="literal">True</span>)</span><br><span class="line">    <span class="keyword">return</span> response</span><br></pre></td></tr></table></figure><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>经过这段时间的尝试和实践，总结如下几点：</p><ul><li>定向场景小模型够用</li><li>模型训练成本低，测试成本高</li><li>数据集非常重要</li><li>transformers 微调已经非常易用了</li></ul>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;什么是-transformers&quot;&gt;&lt;a href=&quot;#什么是-transformers&quot; class=&quot;headerlink&quot; title=&quot;什么是 transformers&quot;&gt;&lt;/a&gt;什么是</summary>
        
      
    
    
    
    <category term="ai" scheme="https://www.linkinstars.com/categories/ai/"/>
    
    
    <category term="transformers" scheme="https://www.linkinstars.com/tags/transformers/"/>
    
  </entry>
  
  <entry>
    <title>使用 Golang 快速体验 MCP 的魅力</title>
    <link href="https://www.linkinstars.com/post/b0b5ab4.html"/>
    <id>https://www.linkinstars.com/post/b0b5ab4.html</id>
    <published>2025-04-14T16:00:00.000Z</published>
    <updated>2025-06-13T06:57:21.474Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/b0b5ab4.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><blockquote><p>本文将带你使用 <a href="https://ollama.com/">ollama</a> + <a href="https://github.com/mark3labs/mcp-go/">mcp-go</a> + <a href="https://github.com/CherryHQ/cherry-studio">cherry-studio</a> 在本地实现并构建完整的一套流程。</p></blockquote><p>在 MCP 的风吹了那么久之后，相信你已经看过不少介绍了，如果你还没有实际体验过 MCP 的魅力，又或者你想通过 Golang 构建一个 MCP 服务，那么就来看看吧。</p><h2 id="准备"><a href="#准备" class="headerlink" title="准备"></a>准备</h2><ul><li>框架 <a href="https://github.com/mark3labs/mcp-go/">https://github.com/mark3labs/mcp-go/</a></li><li>对话客户端 <a href="https://github.com/CherryHQ/cherry-studio">https://github.com/CherryHQ/cherry-studio</a> 所有支持 MCP 的客户端都可以再比如 Claude Desktop</li><li>模型使用 ollama 跑一个 qwen2.5:7b 当然如果你本地无法运行也可以接入线上各个厂商提供的模型</li></ul><h2 id="解释"><a href="#解释" class="headerlink" title="解释"></a>解释</h2><blockquote><p>首先快速白话解释一下什么是 MCP ？</p></blockquote><h3 id="在没有-MCP-的时候"><a href="#在没有-MCP-的时候" class="headerlink" title="在没有 MCP 的时候"></a>在没有 MCP 的时候</h3><ul><li>用户：大模型，帮我点个外卖</li><li>大模型：好的，我将告诉你，点外卖的步骤是….但你得自己点，因为我没有“手”</li></ul><h3 id="当有了-MCP-之后"><a href="#当有了-MCP-之后" class="headerlink" title="当有了 MCP 之后"></a>当有了 MCP 之后</h3><ul><li>用户：大模型，帮我点个外卖</li><li>大模型：好的，鳄了 MCP 服务去帮我下单一杯咖啡</li><li>鳄了 MCP 服务：收到，咖啡订单已提交</li></ul><p>所以，其实你不用了解太多的细节，也能知道 MCP 是做什么用的。关键就是，大模型本身仅提供了对话的能力，而想要实际操作一些东西的时候，它无法触及，而此时外部系统通常的做法是提供一些 API 接口，让其他系统能够调用自己，从而提供服务。而 API 的问题在于每家都有自己的参数列表和返回结果，MCP 的优势是在于制定了协议统一了接入的规范，从而让各个系统都能轻松的被大模型调用。<strong>从宏观的角度看，就是给大模型装上了 “手” 让他能触及实际的业务场景。</strong></p><h2 id="目标"><a href="#目标" class="headerlink" title="目标"></a>目标</h2><p>我网上看了很多有关 MCP 实现案例的文章，发现一个共同的问题，大部分都在提供了一个工具服务为加法，a+b &#x3D; c 这样。但是这完全体现不出 MCP 的意义（我的大模型自己不会算吗？非得你 MCP 教我？[狗头]）。所以我们这次的目标是让大模型可以直接操作你本地的文件系统，从而体会到 MCP 的魅力。</p><h2 id="直接上代码"><a href="#直接上代码" class="headerlink" title="直接上代码"></a>直接上代码</h2><p>不搞哪些花里胡哨的东西，直接上代码，而且非常简单，一看就懂。</p><h3 id="tools-go"><a href="#tools-go" class="headerlink" title="tools.go"></a>tools.go</h3><p>首先定义两个方法，用于操作本地的目录文件</p><ul><li>listFiles 用于查询目录下的所有文件</li><li>renameFile 由于重命名一个文件<br><code>directoryPath</code> 设置了仅允许操作的文件目录，防止大模型看到一些不该看的小视频</li></ul><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;errors&quot;</span></span><br><span class="line"><span class="string">&quot;fmt&quot;</span></span><br><span class="line"><span class="string">&quot;os&quot;</span></span><br><span class="line"><span class="string">&quot;path/filepath&quot;</span></span><br><span class="line"><span class="string">&quot;strings&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">const</span> (</span><br><span class="line">directoryPath = <span class="string">&quot;/tmp/linkinstar&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">listFiles</span><span class="params">(directory <span class="type">string</span>, extension <span class="type">string</span>)</span></span> ([]<span class="type">string</span>, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="keyword">if</span> strings.TrimSpace(directory) != directoryPath &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, errors.New(fmt.Sprintf(<span class="string">&quot;我无法访问目录 %s&quot;</span>, directory))</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">fmt.Println(<span class="string">&quot;Listing files in directory :&quot;</span>, directory)</span><br><span class="line"></span><br><span class="line">files, err := os.ReadDir(directory)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> result []<span class="type">string</span></span><br><span class="line"><span class="keyword">for</span> _, file := <span class="keyword">range</span> files &#123;</span><br><span class="line"><span class="keyword">if</span> file.IsDir() &#123;</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> extension == <span class="string">&quot;&quot;</span> || strings.HasSuffix(file.Name(), extension) &#123;</span><br><span class="line">result = <span class="built_in">append</span>(result, file.Name())</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> result, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">renameFile</span><span class="params">(directory <span class="type">string</span>, oldName <span class="type">string</span>, newName <span class="type">string</span>)</span></span> ([]<span class="type">string</span>, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="keyword">if</span> strings.TrimSpace(directory) != directoryPath &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, errors.New(fmt.Sprintf(<span class="string">&quot;我无法访问目录 %s&quot;</span>, directory))</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">fmt.Println(<span class="string">&quot;Renaming file:&quot;</span>, oldName, <span class="string">&quot;to&quot;</span>, newName)</span><br><span class="line"></span><br><span class="line">err := os.Rename(filepath.Join(directory, oldName), filepath.Join(directory, newName))</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> listFiles(directory, <span class="string">&quot;&quot;</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="main-go"><a href="#main-go" class="headerlink" title="main.go"></a>main.go</h3><p>将两个 tools 注册并添加到服务中，最后启动了一个 SSE 的服务。注册时可以看到我们指定了输入的必要参数。具体这里就不过多解释 SSE 是什么了，当然 MCP 也提供了其他接入的方式，比如标准的输入输出等等，这里以 SSE 举例。<code>AddTool</code> 内的实现方法也非常简单，就是获取参数，调用前一步的 方法 ，然后处理并返回结果即可。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;context&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="string">&quot;github.com/mark3labs/mcp-go/mcp&quot;</span></span><br><span class="line"><span class="string">&quot;github.com/mark3labs/mcp-go/server&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">s := server.NewMCPServer(<span class="string">&quot;File manager server&quot;</span>, <span class="string">&quot;1.0.0&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// 添加工具，查询 /tmp 下的目录</span></span><br><span class="line">listFilesTool := mcp.NewTool(<span class="string">&quot;list_files&quot;</span>, mcp.WithDescription(<span class="string">&quot;列出指定目录下的文件&quot;</span>),</span><br><span class="line">mcp.WithString(<span class="string">&quot;directory&quot;</span>, mcp.Required(), mcp.Description(<span class="string">&quot;要列出文件的目录&quot;</span>)),</span><br><span class="line">mcp.WithString(<span class="string">&quot;extension&quot;</span>, mcp.Description(<span class="string">&quot;要过滤的文件扩展名&quot;</span>)),</span><br><span class="line">)</span><br><span class="line">s.AddTool(listFilesTool, <span class="function"><span class="keyword">func</span><span class="params">(ctx context.Context, request mcp.CallToolRequest)</span></span> (*mcp.CallToolResult, <span class="type">error</span>) &#123;</span><br><span class="line">directory := request.Params.Arguments[<span class="string">&quot;directory&quot;</span>].(<span class="type">string</span>)</span><br><span class="line">extension := request.Params.Arguments[<span class="string">&quot;extension&quot;</span>].(<span class="type">string</span>)</span><br><span class="line"></span><br><span class="line">files, err := listFiles(directory, extension)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> mcp.NewToolResultText(err.Error()), <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">res := <span class="string">&quot;文件列表:\n&quot;</span></span><br><span class="line"><span class="keyword">for</span> _, file := <span class="keyword">range</span> files &#123;</span><br><span class="line">res += file + <span class="string">&quot;\n&quot;</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> mcp.NewToolResultText(res), <span class="literal">nil</span></span><br><span class="line">&#125;)</span><br><span class="line"></span><br><span class="line"><span class="comment">// 添加工具，重命名文件</span></span><br><span class="line">renameFileTool := mcp.NewTool(<span class="string">&quot;rename_file&quot;</span>, mcp.WithDescription(<span class="string">&quot;重命名文件&quot;</span>),</span><br><span class="line">mcp.WithString(<span class="string">&quot;directory&quot;</span>, mcp.Required(), mcp.Description(<span class="string">&quot;要重命名文件的目录&quot;</span>)),</span><br><span class="line">mcp.WithString(<span class="string">&quot;old_name&quot;</span>, mcp.Required(), mcp.Description(<span class="string">&quot;要重命名的旧文件名&quot;</span>)),</span><br><span class="line">mcp.WithString(<span class="string">&quot;new_name&quot;</span>, mcp.Required(), mcp.Description(<span class="string">&quot;新的文件名&quot;</span>)),</span><br><span class="line">)</span><br><span class="line">s.AddTool(renameFileTool, <span class="function"><span class="keyword">func</span><span class="params">(ctx context.Context, request mcp.CallToolRequest)</span></span> (*mcp.CallToolResult, <span class="type">error</span>) &#123;</span><br><span class="line">directory := request.Params.Arguments[<span class="string">&quot;directory&quot;</span>].(<span class="type">string</span>)</span><br><span class="line">oldName := request.Params.Arguments[<span class="string">&quot;old_name&quot;</span>].(<span class="type">string</span>)</span><br><span class="line">newName := request.Params.Arguments[<span class="string">&quot;new_name&quot;</span>].(<span class="type">string</span>)</span><br><span class="line"></span><br><span class="line">files, err := renameFile(directory, oldName, newName)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">res := <span class="string">&quot;重命名后的文件列表:\n&quot;</span></span><br><span class="line"><span class="keyword">for</span> _, file := <span class="keyword">range</span> files &#123;</span><br><span class="line">res += file + <span class="string">&quot;\n&quot;</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> mcp.NewToolResultText(res), <span class="literal">nil</span></span><br><span class="line">&#125;)</span><br><span class="line"></span><br><span class="line">err := server.NewSSEServer(s).Start(<span class="string">&quot;:9999&quot;</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="built_in">panic</span>(err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><h2 id="测试"><a href="#测试" class="headerlink" title="测试"></a>测试</h2><blockquote><p>测试前请保证大模型本身支持 MCP 并正常运行，我使用的是：ollama run qwen2.5:7b</p></blockquote><h3 id="配置-MCP-服务"><a href="#配置-MCP-服务" class="headerlink" title="配置 MCP 服务"></a>配置 MCP 服务</h3><p>配置非常简单，只需要配置一个 <code>http://127.0.0.1:9999/sse</code> 地址就可以了</p><p><img src="https://blog.linkinstars.com/blog/mcp-go-try-set-mcp-server-config.png" alt="mcp-go-try-set-mcp-server-config.png"></p><p>记得需要在对话前启用指定的 MCP 服务哦</p><p><img src="https://blog.linkinstars.com/blog/mcp-go-try-set-mcp-server.png" alt="mcp-go-try-set-mcp-server.png"></p><h3 id="对话测试"><a href="#对话测试" class="headerlink" title="对话测试"></a>对话测试</h3><p>如果你可以看到在对话中客户端主动调用了你的 MCP 服务证明成功了</p><p><img src="https://blog.linkinstars.com/blog/mcp-go-try-set-chat2.png" alt="mcp-go-try-set-chat2.png"></p><p>可以看到，大模型可以理解我们的要求，并调用对应所需要的 MCP 服务从而实现对应的操作。现在大模型的手已经可以伸到我们本地来咯。</p><h2 id="扩展与总结"><a href="#扩展与总结" class="headerlink" title="扩展与总结"></a>扩展与总结</h2><h3 id="扩展"><a href="#扩展" class="headerlink" title="扩展"></a>扩展</h3><p>除了我们上面案例中提到通过 <code>AddTool</code> 方法告诉大模型你提供了哪一些工具，另外还有 <code>AddResource</code> <code>AddPrompt</code> 提供可访问的资源以及最佳实践的一些提示词等。</p><p>在上面的案例中我们只是简单的列表和重命名了本地的文件，你可以进一步扩展，比如制作一个本地的文件自动管理工具，自动将杂乱无序的文件以一种合理的顺序归类并整理好。</p><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><p>就像前面提到的那样，MCP 就像是给大模型装上了 “手” ，<strong>大脑（大模型）负责处理我们说的指令，将指令拆分成各个动作，然后调用各个协调系统（MCP）最终完成这个指令的工作</strong>。相信你看完本文应该不仅能快速上手 MCP 的使用，还能体会到 MCP 的魅力所在。那么赶紧试试吧，去构建你自己的 MCP 服务吧。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;blockquote&gt;
&lt;p&gt;本文将带你使用 &lt;a href=&quot;https://ollama.com/&quot;&gt;ollama&lt;/a&gt; + &lt;a</summary>
        
      
    
    
    
    <category term="ai" scheme="https://www.linkinstars.com/categories/ai/"/>
    
    
    <category term="mcp" scheme="https://www.linkinstars.com/tags/mcp/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》交通指挥员 Core-DNS</title>
    <link href="https://www.linkinstars.com/post/6b16626e.html"/>
    <id>https://www.linkinstars.com/post/6b16626e.html</id>
    <published>2025-03-29T16:00:00.000Z</published>
    <updated>2025-06-13T06:34:01.974Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/6b16626e.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>在 k8s 中 pod 之间的访问在所难免，多个系统之间往往有着调用和互动。那么他们之间的访问就离不开 DNS 的帮助，两个相同 namespace 下的 pod 可以直接通过名称访问，而不同 namespace 下也只需要加上 namespace 就可以了。而实现的关键就是我们的 “交通指挥员” Core-DNS。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li><a href="https://coredns.io/manual/toc/">Core-DNS 的基本功能</a></li><li>简单的了解 <a href="https://github.com/caddyserver/caddy">caddy</a></li></ul><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>第一次知道它是哪个时候还是 kube-dns，后来才是 Core-DNS。因为在整个系统安装好之后，就会有一个 pod 名字带 core-dns 的 pod。我当时就好奇这个 pod 是干什么的，于是就去研究，后来才知道它是负责 DNS 的。今天我们就来一起读读 Core-DNS 的源码，看看它是如何工作的。</p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><p>在看实际的源码之前，首先我们要明白 Core-DNS 的基本架构和工作原理。Core-DNS 本质是一个灵活的 DNS 服务器，我们知道 DNS 服务器的主要功能是解析域名，将域名转换为 IP 地址。而 Core-DNS 解析的域名不是普通的域名，而是 Kubernetes 集群中的服务和 Pod 的域名。而 IP 地址则是 Pod 的 IP 地址。从而让 pod 之间访问的时候不要用记不住的 IP 地址，而且 IP 地址还会变动。那么其实我们最关心的就是 Core-DNS 是如何知道 Pod 的 IP 地址的，以及它是如何处理 DNS 查询的。</p><ol><li><code>Core-DNS</code> 是如何知道 pod 的 IP 地址的？</li><li><code>Core-DNS</code> 是如何处理 DNS 查询的？</li><li><code>Core-DNS</code> 有什么优化措施来提高性能？</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><p>首先 Core-DNS 在 <a href="https://github.com/coredns/coredns">https://github.com/coredns/coredns</a></p><blockquote><p>由于 Core-DNS 其实是利用了老版本的 caddy 作为一个启动器，而所有必要的功能都是以插件的形式存在的，所以看源码的时候可能会跳出这个仓库本身。并且跳出去之后可能通过查看引用的方式是没有办法直接跳回来的，所以请注意跳转的时候记住来源。</p></blockquote><h3 id="入口"><a href="#入口" class="headerlink" title="入口"></a>入口</h3><p>入口其实特别简单，就在最外面的文件中，引入插件，然后调用 <code>coremain.Run()</code> 启动 Core-DNS。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// coredns.go</span></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    _ <span class="string">&quot;github.com/coredns/coredns/core/plugin&quot;</span> <span class="comment">// Plug in CoreDNS.</span></span><br><span class="line">    <span class="string">&quot;github.com/coredns/coredns/coremain&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    coremain.Run()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>然后你点进去 <code>coremain.Run()</code> 就会发现懵了，因为其本质是在启动一个 caddy 实例。而并没有任何调用 Core-DNS 的代码。怎么回事呢？</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Run is CoreDNS&#x27;s main() function.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Run</span><span class="params">()</span></span> &#123;</span><br><span class="line">    caddy.TrapSignals()</span><br><span class="line">    flag.Parse()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(flag.Args()) &gt; <span class="number">0</span> &#123;</span><br><span class="line">        mustLogFatal(fmt.Errorf(<span class="string">&quot;extra command line arguments: %s&quot;</span>, flag.Args()))</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    log.SetOutput(os.Stdout)</span><br><span class="line">    log.SetFlags(LogFlags)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> version &#123;</span><br><span class="line">        showVersion()</span><br><span class="line">        os.Exit(<span class="number">0</span>)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> plugins &#123;</span><br><span class="line">        fmt.Println(caddy.DescribePlugins())</span><br><span class="line">        os.Exit(<span class="number">0</span>)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    _, err := maxprocs.Set(maxprocs.Logger(log.Printf))</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        log.Println(<span class="string">&quot;[WARNING] Failed to set GOMAXPROCS:&quot;</span>, err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Get Corefile input</span></span><br><span class="line">    corefile, err := caddy.LoadCaddyfile(serverType)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        mustLogFatal(err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Start your engines</span></span><br><span class="line">    instance, err := caddy.Start(corefile)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        mustLogFatal(err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> !dnsserver.Quiet &#123;</span><br><span class="line">        showVersion()</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Twiddle your thumbs</span></span><br><span class="line">    instance.Wait()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>此时，不要慌，我们可以尝试去 <code>caddy.Start</code> 里面看看，就会看到其实调用关系是 <code>caddy.Start</code> -&gt; <code>startWithListenerFds</code> -&gt; <code>inst.context.MakeServers</code> 。而其中的 context 是一个接口</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">type</span> Context <span class="keyword">interface</span> &#123;</span><br><span class="line">    InspectServerBlocks(<span class="type">string</span>, []caddyfile.ServerBlock) ([]caddyfile.ServerBlock, <span class="type">error</span>)</span><br><span class="line"></span><br><span class="line">    <span class="comment">// This is what Caddy calls to make server instances.</span></span><br><span class="line">    <span class="comment">// By this time, all directives have been executed and,</span></span><br><span class="line">    <span class="comment">// presumably, the context has enough state to produce</span></span><br><span class="line">    <span class="comment">// server instances for Caddy to start.</span></span><br><span class="line">    MakeServers() ([]Server, <span class="type">error</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>而在我们的 <code>/core/dnsserver/register.go</code> 有一个 <code>dnsContext</code> 实现了这个接口。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// core/dnsserver/register.go:37</span></span><br><span class="line"><span class="keyword">type</span> dnsContext <span class="keyword">struct</span> &#123;</span><br><span class="line">    keysToConfigs <span class="keyword">map</span>[<span class="type">string</span>]*Config</span><br><span class="line"></span><br><span class="line">    <span class="comment">// configs is the master list of all site configs.</span></span><br><span class="line">    configs []*Config</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>而且其中的 <code>MakeServers</code> 方法就是我们需要的。它会调用 <code>makeServersForGroup</code> 方法来创建服务器实例。如下：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// core/dnsserver/register.go:301</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">makeServersForGroup</span><span class="params">(addr <span class="type">string</span>, group []*Config)</span></span> ([]caddy.Server, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="comment">//....</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">var</span> servers []caddy.Server</span><br><span class="line">    <span class="keyword">for</span> <span class="keyword">range</span> numSockets &#123;</span><br><span class="line">        <span class="comment">// switch on addr</span></span><br><span class="line">        <span class="keyword">switch</span> tr, _ := parse.Transport(addr); tr &#123;</span><br><span class="line">        <span class="keyword">case</span> transport.DNS:</span><br><span class="line">            s, err := NewServer(addr, group)</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        <span class="keyword">case</span> transport.TLS:</span><br><span class="line">            s, err := NewServerTLS(addr, group)</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        <span class="keyword">case</span> transport.QUIC:</span><br><span class="line">            s, err := NewServerQUIC(addr, group)</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        <span class="keyword">case</span> transport.GRPC:</span><br><span class="line">            s, err := NewServergRPC(addr, group)</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        <span class="keyword">case</span> transport.HTTPS:</span><br><span class="line">            s, err := NewServerHTTPS(addr, group)</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> servers, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>看到这里的 <code>NewServer</code> 方法了吗？它就是我们 Core-DNS 的核心服务器。我们可以继续深入看看它的实现。而 <code>NewServer</code> 创建的 Server 是实现了 <code>caddy.TCPServer</code> 和 <code>caddy.UDPServer</code> 接口的，即是有 <code>Serve</code> 和 <code>ServePacket</code> 方法的。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// core/dnsserver/server.go:148</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(s *Server)</span></span> Serve(l net.Listener) <span class="type">error</span> &#123;</span><br><span class="line">    s.m.Lock()</span><br><span class="line"></span><br><span class="line">    s.server[tcp] = &amp;dns.Server&#123;Listener: l,</span><br><span class="line">        Net:           <span class="string">&quot;tcp&quot;</span>,</span><br><span class="line">        TsigSecret:    s.tsigSecret,</span><br><span class="line">        MaxTCPQueries: tcpMaxQueries,</span><br><span class="line">        ReadTimeout:   s.readTimeout,</span><br><span class="line">        WriteTimeout:  s.writeTimeout,</span><br><span class="line">        IdleTimeout: <span class="function"><span class="keyword">func</span><span class="params">()</span></span> time.Duration &#123;</span><br><span class="line">            <span class="keyword">return</span> s.idleTimeout</span><br><span class="line">        &#125;,</span><br><span class="line">        Handler: dns.HandlerFunc(<span class="function"><span class="keyword">func</span><span class="params">(w dns.ResponseWriter, r *dns.Msg)</span></span> &#123;</span><br><span class="line">            ctx := context.WithValue(context.Background(), Key&#123;&#125;, s)</span><br><span class="line">            ctx = context.WithValue(ctx, LoopKey&#123;&#125;, <span class="number">0</span>)</span><br><span class="line">            s.ServeDNS(ctx, w, r)</span><br><span class="line">        &#125;)&#125;</span><br><span class="line"></span><br><span class="line">    s.m.Unlock()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> s.server[tcp].ActivateAndServe()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>到这里，其实对于整个启动过程我们有了一个大致的认识，这里非常绕的原因是因为它的启动依赖于 Caddy ，所以真正的运行是在那边里面，而这里仅仅只是实现了必要的接口而已。<strong>所以有时候看源码单独只是看本仓库，往往会完全不理解在干什么</strong>，这件事告诉我们，有的时候如果看不明或找不到一些入口的时候，可能是因为它依赖了其他仓库的代码。</p><h3 id="插件系统核心"><a href="#插件系统核心" class="headerlink" title="插件系统核心"></a>插件系统核心</h3><p>其实插件本身并不复杂，在这里插件的实现就仅仅是实现接口，然后注册，最后被调用而已。接口是下面这样：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/plugin.go:50</span></span><br><span class="line">Handler <span class="keyword">interface</span> &#123;</span><br><span class="line">    ServeDNS(context.Context, dns.ResponseWriter, *dns.Msg) (<span class="type">int</span>, <span class="type">error</span>)</span><br><span class="line">    Name() <span class="type">string</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>而注册就更简单了</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">init</span><span class="params">()</span></span> &#123; plugin.Register(<span class="string">&quot;whoami&quot;</span>, setup) &#125;</span><br></pre></td></tr></table></figure><p>就是在每个插件的 <code>init</code> 函数中调用 <code>plugin.Register</code> 方法来注册插件。这样在 Core-DNS 启动时就会自动加载这些插件。当然，不要忘记本分，我们今天最重要的目的是看它在 k8s 中是如何配合工作的，所以我们需要关注的是 Kubernetes 插件的实现。</p><h3 id="Kubernetes-插件分析"><a href="#Kubernetes-插件分析" class="headerlink" title="Kubernetes 插件分析"></a>Kubernetes 插件分析</h3><p>相比与其他插件，kubernetes 插件代码就要多的多了。Kubernetes 插件是 Core-DNS 的核心，负责与 K8s API 交互。首先让我们来看到注册的部分：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/kubernetes/setup.go:26</span></span><br><span class="line"><span class="keyword">const</span> pluginName = <span class="string">&quot;kubernetes&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">init</span><span class="params">()</span></span> &#123; plugin.Register(pluginName, setup) &#125;</span><br></pre></td></tr></table></figure><p>和其他插件一样，没什么好说的，继续，我们来看看 <code>setup</code> 函数，其中调用了 <code>InitKubeCache</code> 这个方法里面调用了 <code>newdnsController</code> 这是我们的关键：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/kubernetes/controller.go:130</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">newdnsController</span><span class="params">(ctx context.Context, kubeClient kubernetes.Interface, mcsClient mcsClientset.MulticlusterV1alpha1Interface, opts dnsControlOpts)</span></span> *dnsControl &#123;</span><br><span class="line">    dns := dnsControl&#123;</span><br><span class="line">        client:            kubeClient,</span><br><span class="line">        mcsClient:         mcsClient,</span><br><span class="line">        selector:          opts.selector,</span><br><span class="line">        namespaceSelector: opts.namespaceSelector,</span><br><span class="line">        stopCh:            <span class="built_in">make</span>(<span class="keyword">chan</span> <span class="keyword">struct</span>&#123;&#125;),</span><br><span class="line">        zones:             opts.zones,</span><br><span class="line">        endpointNameMode:  opts.endpointNameMode,</span><br><span class="line">        multiclusterZones: opts.multiclusterZones,</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    dns.svcLister, dns.svcController = object.NewIndexerInformer(</span><br><span class="line">        &amp;cache.ListWatch&#123;</span><br><span class="line">            ListFunc:  serviceListFunc(ctx, dns.client, api.NamespaceAll, dns.selector),</span><br><span class="line">            WatchFunc: serviceWatchFunc(ctx, dns.client, api.NamespaceAll, dns.selector),</span><br><span class="line">        &#125;,</span><br><span class="line">        &amp;api.Service&#123;&#125;,</span><br><span class="line">        cache.ResourceEventHandlerFuncs&#123;AddFunc: dns.Add, UpdateFunc: dns.Update, DeleteFunc: dns.Delete&#125;,</span><br><span class="line">        cache.Indexers&#123;svcNameNamespaceIndex: svcNameNamespaceIndexFunc, svcIPIndex: svcIPIndexFunc, svcExtIPIndex: svcExtIPIndexFunc&#125;,</span><br><span class="line">        object.DefaultProcessor(object.ToService, <span class="literal">nil</span>),</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    podLister, podController := object.NewIndexerInformer(</span><br><span class="line">        &amp;cache.ListWatch&#123;</span><br><span class="line">            ListFunc:  podListFunc(ctx, dns.client, api.NamespaceAll, dns.selector),</span><br><span class="line">            WatchFunc: podWatchFunc(ctx, dns.client, api.NamespaceAll, dns.selector),</span><br><span class="line">        &#125;,</span><br><span class="line">        &amp;api.Pod&#123;&#125;,</span><br><span class="line">        cache.ResourceEventHandlerFuncs&#123;AddFunc: dns.Add, UpdateFunc: dns.Update, DeleteFunc: dns.Delete&#125;,</span><br><span class="line">        cache.Indexers&#123;podIPIndex: podIPIndexFunc&#125;,</span><br><span class="line">        object.DefaultProcessor(object.ToPod, <span class="literal">nil</span>),</span><br><span class="line">    )</span><br><span class="line">    dns.podLister = podLister</span><br><span class="line">    <span class="keyword">if</span> opts.initPodCache &#123;</span><br><span class="line">        dns.podController = podController</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    epLister, epController := object.NewIndexerInformer(</span><br><span class="line">        &amp;cache.ListWatch&#123;</span><br><span class="line">            ListFunc:  endpointSliceListFunc(ctx, dns.client, api.NamespaceAll, dns.selector),</span><br><span class="line">            WatchFunc: endpointSliceWatchFunc(ctx, dns.client, api.NamespaceAll, dns.selector),</span><br><span class="line">        &#125;,</span><br><span class="line">        &amp;discovery.EndpointSlice&#123;&#125;,</span><br><span class="line">        cache.ResourceEventHandlerFuncs&#123;AddFunc: dns.Add, UpdateFunc: dns.Update, DeleteFunc: dns.Delete&#125;,</span><br><span class="line">        cache.Indexers&#123;epNameNamespaceIndex: epNameNamespaceIndexFunc, epIPIndex: epIPIndexFunc&#125;,</span><br><span class="line">        object.DefaultProcessor(object.EndpointSliceToEndpoints, dns.EndpointSliceLatencyRecorder()),</span><br><span class="line">    )</span><br><span class="line">    dns.epLister = epLister</span><br><span class="line">    <span class="keyword">if</span> opts.initEndpointsCache &#123;</span><br><span class="line">        dns.epController = epController</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    dns.nsLister, dns.nsController = object.NewIndexerInformer(</span><br><span class="line">        &amp;cache.ListWatch&#123;</span><br><span class="line">            ListFunc:  namespaceListFunc(ctx, dns.client, dns.namespaceSelector),</span><br><span class="line">            WatchFunc: namespaceWatchFunc(ctx, dns.client, dns.namespaceSelector),</span><br><span class="line">        &#125;,</span><br><span class="line">        &amp;api.Namespace&#123;&#125;,</span><br><span class="line">        cache.ResourceEventHandlerFuncs&#123;&#125;,</span><br><span class="line">        cache.Indexers&#123;&#125;,</span><br><span class="line">        object.DefaultProcessor(object.ToNamespace, <span class="literal">nil</span>),</span><br><span class="line">    )</span><br><span class="line"></span><br><span class="line">    <span class="comment">//....</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> &amp;dns</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>看到熟悉的 <code>Informer</code> 了吗？它是 Kubernetes 的核心组件之一，用于监听和缓存 Kubernetes API 对象的变化。这里我们创建了多个 <code>Informer</code>，分别用于监听 Service、Pod、Endpoints 和 Namespace 的变化。我们知道，不管是在同一个 namespace 还是不同 namespace 下，Pod 都可以通过 DNS 名称访问其他 Pod。所以这些不同的事件变动都要监听。而这几个资源的变化都会触发 注册的 eventHandler 也就是 <code>Add</code>、<code>Update</code> 和 <code>Delete</code> 方法。</p><p>你一定以为 <code>Add</code>、<code>Update</code> 和 <code>Delete</code> 这些方法会具体处理数据？但其实你实际去看看，他们其实本质都是去更新了一下时间戳而已。而真正的数据 cache 在 cache.Indexers 里面，在 <code>dnsControl</code> 中有几个 <code>cache.Indexer</code> 专门用了放他们。而本地缓存的意义就在于避免去频繁的访问 Kubernetes API Server，这是显而易见的。</p><p>其中有一个 <code>podLister cache.Indexer</code> 我们后面还会看到。</p><h3 id="ServeDNS"><a href="#ServeDNS" class="headerlink" title="ServeDNS"></a>ServeDNS</h3><p>最后我们来看看请求来的时候是如何处理的，而这里的关键则就在与 <code>ServeDNS</code> 方法了</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/kubernetes/handler.go:13</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(k Kubernetes)</span></span> ServeDNS(ctx context.Context, w dns.ResponseWriter, r *dns.Msg) (<span class="type">int</span>, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="comment">//....</span></span><br><span class="line">    <span class="keyword">switch</span> state.QType() &#123;</span><br><span class="line">    <span class="keyword">case</span> dns.TypeA:</span><br><span class="line">        records, truncated, err = plugin.A(ctx, &amp;k, zone, state, <span class="literal">nil</span>, plugin.Options&#123;&#125;)</span><br><span class="line">    <span class="keyword">case</span> dns.TypeAAAA:</span><br><span class="line">        records, truncated, err = plugin.AAAA(ctx, &amp;k, zone, state, <span class="literal">nil</span>, plugin.Options&#123;&#125;)</span><br><span class="line">    <span class="keyword">case</span> dns.TypeTXT:</span><br><span class="line">        records, truncated, err = plugin.TXT(ctx, &amp;k, zone, state, <span class="literal">nil</span>, plugin.Options&#123;&#125;)</span><br><span class="line">    <span class="keyword">case</span> dns.TypeCNAME:</span><br><span class="line">        records, err = plugin.CNAME(ctx, &amp;k, zone, state, plugin.Options&#123;&#125;)</span><br><span class="line">    <span class="keyword">case</span> dns.TypePTR:</span><br><span class="line">        records, err = plugin.PTR(ctx, &amp;k, zone, state, plugin.Options&#123;&#125;)</span><br><span class="line">    <span class="keyword">case</span> dns.TypeMX:</span><br><span class="line">        records, extra, err = plugin.MX(ctx, &amp;k, zone, state, plugin.Options&#123;&#125;)</span><br><span class="line">    <span class="keyword">case</span> dns.TypeSRV:</span><br><span class="line">        records, extra, err = plugin.SRV(ctx, &amp;k, zone, state, plugin.Options&#123;&#125;)</span><br><span class="line">    <span class="keyword">case</span> dns.TypeSOA:</span><br><span class="line">        <span class="keyword">if</span> qname == zone &#123;</span><br><span class="line">            records, err = plugin.SOA(ctx, &amp;k, zone, state, plugin.Options&#123;&#125;)</span><br><span class="line">        &#125;</span><br><span class="line">    <span class="keyword">case</span> dns.TypeAXFR, dns.TypeIXFR:</span><br><span class="line">        <span class="keyword">return</span> dns.RcodeRefused, <span class="literal">nil</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">//....</span></span><br><span class="line">    <span class="keyword">return</span> dns.RcodeSuccess, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>可以看到，这里会根据不同的 DNS 查询类型来调用不同的处理方法。比如对于 A 记录查询，会调用 <code>plugin.A</code> 方法，而对于 AAAA 记录查询，则会调用 <code>plugin.AAAA</code> 方法。这些方法会根据当前的状态和查询条件来返回相应的 DNS 记录。</p><p>而最终不同的方法都会转回到 <code>Services</code> 方法中，通过 <code>k.Records(ctx, state, false)</code> 最后我们可以确认我们找到了，<code>findPods</code> 方法</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/kubernetes/kubernetes.go:95</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(k *Kubernetes)</span></span> Services(ctx context.Context, state request.Request, exact <span class="type">bool</span>, opt plugin.Options) (svcs []msg.Service, err <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="comment">//...</span></span><br><span class="line">    s, e := k.Records(ctx, state, <span class="literal">false</span>)</span><br><span class="line">    <span class="comment">//...</span></span><br><span class="line">    <span class="keyword">return</span> internal, e</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/kubernetes/kubernetes.go:310</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(k *Kubernetes)</span></span> Records(ctx context.Context, state request.Request, exact <span class="type">bool</span>) ([]msg.Service, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="comment">//...</span></span><br><span class="line">    <span class="keyword">if</span> r.podOrSvc == Pod &#123;</span><br><span class="line">        pods, err := k.findPods(r, state.Zone)</span><br><span class="line">        <span class="keyword">return</span> pods, err</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">var</span> services []msg.Service</span><br><span class="line">    <span class="keyword">var</span> err <span class="type">error</span></span><br><span class="line">    <span class="keyword">if</span> !multicluster &#123;</span><br><span class="line">        services, err = k.findServices(r, state.Zone)</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        services, err = k.findMultiClusterServices(r, state.Zone)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> services, err</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>而 <code>findPods</code> 当我看到这个名字的时候我就知道距离胜利不远了。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/kubernetes/kubernetes.go:359</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(k *Kubernetes)</span></span> findPods(r recordRequest, zone <span class="type">string</span>) (pods []msg.Service, err <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="comment">//...</span></span><br><span class="line"></span><br><span class="line">    zonePath := msg.Path(zone, coredns)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">var</span> ip <span class="type">string</span></span><br><span class="line">    <span class="keyword">if</span> strings.Count(podname, <span class="string">&quot;-&quot;</span>) == <span class="number">3</span> &amp;&amp; !strings.Contains(podname, <span class="string">&quot;--&quot;</span>) &#123;</span><br><span class="line">        ip = strings.ReplaceAll(podname, <span class="string">&quot;-&quot;</span>, <span class="string">&quot;.&quot;</span>)</span><br><span class="line">    &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">        ip = strings.ReplaceAll(podname, <span class="string">&quot;-&quot;</span>, <span class="string">&quot;:&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">//...</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> _, p := <span class="keyword">range</span> k.APIConn.PodIndex(ip) &#123;</span><br><span class="line">        <span class="comment">// check for matching ip and namespace</span></span><br><span class="line">        <span class="keyword">if</span> ip == p.PodIP &amp;&amp; match(namespace, p.Namespace) &#123;</span><br><span class="line">            s := msg.Service&#123;Key: strings.Join([]<span class="type">string</span>&#123;zonePath, Pod, namespace, podname&#125;, <span class="string">&quot;/&quot;</span>), Host: ip, TTL: k.ttl&#125;</span><br><span class="line">            pods = <span class="built_in">append</span>(pods, s)</span><br><span class="line"></span><br><span class="line">            err = <span class="literal">nil</span></span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> pods, err</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>显然这里就是对于请求的 Pod 的 IP 地址进行查询。我们只需要去看 <code>PodIndex</code> 方法就可以了。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// plugin/kubernetes/controller.go:504</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dns *dnsControl)</span></span> PodIndex(ip <span class="type">string</span>) (pods []*object.Pod) &#123;</span><br><span class="line">    os, err := dns.podLister.ByIndex(podIPIndex, ip)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">for</span> _, o := <span class="keyword">range</span> os &#123;</span><br><span class="line">        p, ok := o.(*object.Pod)</span><br><span class="line">        <span class="keyword">if</span> !ok &#123;</span><br><span class="line">            <span class="keyword">continue</span></span><br><span class="line">        &#125;</span><br><span class="line">        pods = <span class="built_in">append</span>(pods, p)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> pods</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>而这个方法中的 <code>podLister</code> 就是我们之前看到的 <code>podLister cache.Indexer</code> 。它就是 Indexer 缓存。里面存放的就是 Pod 的信息。而 <code>ByIndex</code> 方法则是根据索引来查询 Pod 的信息。</p><p>至此整个链路就全部串起来了。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li><code>Core-DNS</code> 是如何知道 pod 的 IP 地址的？</li></ol><ul><li>还是老一套 <code>Informer</code> 机制而已，<code>Core-DNS</code> 通过 Kubernetes 的 API Server 获取 Pod 的信息，并将其缓存到本地的 <code>cache.Indexer</code> 中。每当 Pod 的信息发生变化时，相关的 <code>Informer</code> 会触发事件，更新缓存中的 Pod 信息。</li></ul><ol start="2"><li><code>Core-DNS</code> 是如何处理 DNS 查询的？</li></ol><ul><li>其实本质就是启动一个 DNS 服务，只不过它对于 DNS 查询会处理 k8s 中的服务和 Pod 的域名解析而已。</li></ul><ol start="3"><li><code>Core-DNS</code> 有什么优化措施来提高性能？</li></ol><ul><li>cache 机制，<code>Core-DNS</code> 使用了本地缓存来存储 Pod 和 Service 的信息，避免频繁访问 Kubernetes API Server。通过 <code>Informer</code> 机制监听资源变化，并更新本地缓存，从而提高查询性能。</li></ul><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><h3 id="插件机制"><a href="#插件机制" class="headerlink" title="插件机制"></a>插件机制</h3><p>其实 Core DNS 本质里面有两种模式在里面，一个是对于 caddy 的套壳，它完全是利用了 caddy 作为了一个启动器，虽然从代码层面来说减少了很多项目启动运行部分的代码，但是实际中我们也看到了，它已经删除了 caddy 许多功能，完全与主干已经脱节了。所以其实它完全可以把那部分直接合并过来作为一个项目里面。我们在看代码的时候会发现跳来跳去，非常的麻烦。对于新人确实不好理解。而另一个模式是 plugin 机制，利用 golang 中的 init 方法实现注册，只要 import 了就会自动注册。很多插件系统的设计也都是如此。</p><h3 id="Informer"><a href="#Informer" class="headerlink" title="Informer"></a>Informer</h3><p>最后提一次它吧，一路看到这里，我相信你已经明白为什么很多人吹 <code>Informer</code> 了。它的设计确实非常的巧妙，利用了缓存和事件驱动的方式来处理 Kubernetes 中的资源变化，而且无论是内部组件还是外部组件都可以用它。通过 <code>Informer</code>，我们可以轻松地监听和处理资源的增删改查，而不需要频繁地访问 API Server。这种设计大大提高了性能和效率。而最关键的是利用了它，所有其他想要扩展 Kubernetes 的功能都可以利用它一方面是简化了开发工作，一方面接入行为也统一。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;在</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》ingress-nginx 是如何工作的</title>
    <link href="https://www.linkinstars.com/post/3f252c5a.html"/>
    <id>https://www.linkinstars.com/post/3f252c5a.html</id>
    <published>2025-03-14T16:00:00.000Z</published>
    <updated>2025-03-26T07:25:48.676Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/3f252c5a.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>前面说完了 Service 和 kube-proxy，那就自然轮到了我们的 Ingress 了。Ingress 作为 Kubernetes 中负责外部流量路由的重要组件，其实现原理和设计思路值得我们深入研究。当然，这里也要说明的是，对于 Ingress 其实在这个版本里面已经有一点点 “落伍” ，官方更为推荐的是使用 <code>Kubernetes Gateway API</code> 来做同样的事情，并且能力更为强大。支持更丰富的协议，更精细化的流量控制，以及对于权限的控制等等。不过在这里不涉及这个部分，感兴趣同学可以自行去查看相关源码。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>Ingress</li><li>Ingress Controller</li></ul><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>之前我也写过对于 Ingress 的基础概念的说明，我称为 service 的 service。其实这个对象的出现是一种顺理成章的事情，对于一个多业务场景下的方案来说，不同域名访问不同服务，一定会需要一个网关这样的角色。而对于大的场景还是小的场景都需要不同的网关去做处理，但实际的能力大差不差。我们最熟悉的就是 <a href="https://github.com/kubernetes/ingress-nginx">https://github.com/kubernetes/ingress-nginx</a> 。所以本文也将以它为源码分析的对象。</p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><p>同样的，在开始源码分析之前，让我们先思考几个问题：</p><ol><li>ingress-nginx 本身究竟是个什么东西？</li><li>Ingress Controller 是如何监听到 Ingress 资源的变化的？</li><li>Ingress Controller 是如何将规则转换为实际的负载均衡配置的？</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><h3 id="Ingress-数据结构"><a href="#Ingress-数据结构" class="headerlink" title="Ingress 数据结构"></a>Ingress 数据结构</h3><p>先看一下我们熟悉的 yaml 配置</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">rules:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">host:</span> <span class="string">api.linkinstars.com</span></span><br><span class="line">      <span class="attr">http:</span></span><br><span class="line">        <span class="attr">paths:</span></span><br><span class="line">          <span class="bullet">-</span> <span class="attr">path:</span> <span class="string">/user</span></span><br><span class="line">            <span class="attr">pathType:</span> <span class="string">Prefix</span></span><br><span class="line">            <span class="attr">backend:</span></span><br><span class="line">              <span class="attr">service:</span></span><br><span class="line">                <span class="attr">name:</span> <span class="string">user</span></span><br><span class="line">                <span class="attr">port:</span></span><br><span class="line">                  <span class="attr">number:</span> <span class="number">8080</span></span><br></pre></td></tr></table></figure><p>数据结构的部分，我相信你很容易可以找到对应的部分。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/api/networking/v1beta1/types.go:35</span></span><br><span class="line"><span class="keyword">type</span> Ingress <span class="keyword">struct</span> &#123;</span><br><span class="line">    metav1.TypeMeta <span class="string">`json:&quot;,inline&quot;`</span></span><br><span class="line">    metav1.ObjectMeta <span class="string">`json:&quot;metadata,omitempty&quot; protobuf:&quot;bytes,1,opt,name=metadata&quot;`</span></span><br><span class="line">    Spec IngressSpec <span class="string">`json:&quot;spec,omitempty&quot; protobuf:&quot;bytes,2,opt,name=spec&quot;`</span></span><br><span class="line">    Status IngressStatus <span class="string">`json:&quot;status,omitempty&quot; protobuf:&quot;bytes,3,opt,name=status&quot;`</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// staging/src/k8s.io/api/networking/v1beta1/types.go:73</span></span><br><span class="line"><span class="keyword">type</span> IngressSpec <span class="keyword">struct</span> &#123;</span><br><span class="line">    IngressClassName *<span class="type">string</span> <span class="string">`json:&quot;ingressClassName,omitempty&quot; protobuf:&quot;bytes,4,opt,name=ingressClassName&quot;`</span></span><br><span class="line">    Backend *IngressBackend <span class="string">`json:&quot;backend,omitempty&quot; protobuf:&quot;bytes,1,opt,name=backend&quot;`</span></span><br><span class="line">    TLS []IngressTLS <span class="string">`json:&quot;tls,omitempty&quot; protobuf:&quot;bytes,2,rep,name=tls&quot;`</span></span><br><span class="line">    Rules []IngressRule <span class="string">`json:&quot;rules,omitempty&quot; protobuf:&quot;bytes,3,rep,name=rules&quot;`</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>所以 Ingress 对象本身很简单，就是记录路由的规则，而这个规则则是 <code>Ingress Controller</code> 最为关注的东西。下面我们就来看看我们今天的重点，也就是 <code>ingress-nginx</code> 的具体实现。注意下面的源码都来自于它，而不是 k8s 主项目中。</p><h3 id="ingress-nginx"><a href="#ingress-nginx" class="headerlink" title="ingress-nginx"></a>ingress-nginx</h3><p>如果你对于 <code>ingress-nginx</code> 不是特别熟悉，强烈建议使用一次，一次就懂。然后可以先看：<code>ingress-nginx/deploy/static/provider/cloud/deploy.yaml</code> 部署。看了这个文件其实对于它本身你就觉得不是什么很可怕的东西了。其中抛开角色、权限、配置相关对象之外就只有：</p><ul><li>一个名叫 <code>ingress-nginx-controller</code> 的 Service</li><li>一个名叫 <code>ingress-nginx-controller</code> 的 Deployment</li><li>一个名叫 <code>nginx</code> 的 IngressClass</li></ul><p>其他可以忽略，重点其实就是他们几个。其实看到 <code>Deployment</code> 的时候，我就已经一下子明白了。<strong>其实本质它还是一个 <code>Deployment</code> 服务而已</strong>，而 <code>Service</code> 是一个 <code>LoadBalancer</code> 。相信到此你应该已经了解基本的结构了，在没有看源码的时候你就可以大胆去猜测了，可能它就是通过一个服务去获取路由规则，而通过 LB 把流量接进来，通过 nginx 的实例将流量按规则去路由而已。而得益于 nginx 本身的路由强大，性能通常也没有什么问题。</p><p>那么本文的两个关键问题就来了：</p><ul><li>ingress-nginx 如何知道规则变动了？</li><li>ingress-nginx 如何把 ingress 的规则转换为 nginx 的规则？</li></ul><h3 id="Ingress-Controller-实现原理"><a href="#Ingress-Controller-实现原理" class="headerlink" title="Ingress Controller 实现原理"></a>Ingress Controller 实现原理</h3><p>还是老套路，由于这是一个独立的服务，所以我们直接从入口着手：</p><h4 id="入口到对象"><a href="#入口到对象" class="headerlink" title="入口到对象"></a>入口到对象</h4><p>入口特别好找，和大多数项目一样就在 cmd 下面，main 的函数开始。其中我对于入口函数精简了绝大多数代码，留下了与我们最相关的部分</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/nginx/main.go:53</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    kubeClient, err := createApiserverClient(conf.APIServerHost, conf.RootCAFile, conf.KubeConfigFile)</span><br><span class="line"></span><br><span class="line">    _, err = kubeClient.NetworkingV1().IngressClasses().List(context.TODO(), metav1.ListOptions&#123;&#125;)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">if</span> !errors.IsNotFound(err) &#123;</span><br><span class="line">            <span class="keyword">if</span> errors.IsForbidden(err) &#123;</span><br><span class="line">                klog.Warningf(<span class="string">&quot;No permissions to list and get Ingress Classes: %v, IngressClass feature will be disabled&quot;</span>, err)</span><br><span class="line">                conf.IngressClassConfiguration.IgnoreIngressClass = <span class="literal">true</span></span><br><span class="line">            &#125;</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    conf.Client = kubeClient</span><br><span class="line"></span><br><span class="line">    err = k8s.GetIngressPod(kubeClient)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        klog.Fatalf(<span class="string">&quot;Unexpected error obtaining ingress-nginx pod: %v&quot;</span>, err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    ngx := controller.NewNGINXController(conf, mc)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">go</span> ngx.Start()</span><br><span class="line"></span><br><span class="line">    process.HandleSigterm(ngx, conf.PostShutdownGracePeriod, <span class="function"><span class="keyword">func</span><span class="params">(code <span class="type">int</span>)</span></span> &#123;</span><br><span class="line">        os.Exit(code)</span><br><span class="line">    &#125;)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>其中我们可以看到：</p><ol><li>通过 <code>createApiserverClient</code> 创建了一个 <code>kubeClient</code> ，然后尝试调用 API 拉取了一下 <code>Ingress</code> 的信息，如果拉取不到则说明没有权限。</li><li>然后获取了一下 <code>IngressPod</code>。</li><li>最重要的就是通过 <code>controller.NewNGINXController</code> 创建我们的主角 <code>controller</code> 并通过 <code>Start</code> 启动起来。</li></ol><p>啊哈，其实从这里的 ApiserverClient 你就已经大概能猜到如何拿到相关信息的了，当然，所有其实 k8s 相关的扩展能力组件都与这个 apiserver 有着关系，因为就是通过它来控制对象或者获取相关资源的信息。</p><p>最后 <code>process.HandleSigterm</code> 其实这部分也是可以抄的，我们放在最后说。</p><h4 id="初始化"><a href="#初始化" class="headerlink" title="初始化"></a>初始化</h4><p>首先来看看 <code>NewNGINXController</code> 创建的时候究竟干了什么。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// internal/ingress/controller/nginx.go:76</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewNGINXController</span><span class="params">(config *Configuration, mc metric.Collector)</span></span> *NGINXController &#123;</span><br><span class="line">    <span class="comment">// ....</span></span><br><span class="line"></span><br><span class="line">    n := &amp;NGINXController&#123;</span><br><span class="line">        <span class="comment">// ....</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ....</span></span><br><span class="line"></span><br><span class="line">    n.store = store.New(</span><br><span class="line">        config.Namespace,</span><br><span class="line">        config.WatchNamespaceSelector,</span><br><span class="line">        config.ConfigMapName,</span><br><span class="line">        config.TCPConfigMapName,</span><br><span class="line">        config.UDPConfigMapName,</span><br><span class="line">        config.DefaultSSLCertificate,</span><br><span class="line">        config.ResyncPeriod,</span><br><span class="line">        config.Client,</span><br><span class="line">        n.updateCh,</span><br><span class="line">        config.DisableCatchAll,</span><br><span class="line">        config.DeepInspector,</span><br><span class="line">        config.IngressClassConfiguration,</span><br><span class="line">        config.DisableSyncEvents)</span><br><span class="line"></span><br><span class="line">    n.syncQueue = task.NewTaskQueue(n.syncIngress) <span class="comment">// 注意一下这里的这个，后面有用</span></span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> n</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>有关 nginx 的部分以及模板的部分都不是我们的重点，因为我们关心的还是 k8s 上，于是在仔细寻找后发现有一个 store 的对象让我找到了关键点。</p><p>你别说，这个 <code>store.New</code> 有足足 600+ 行的代码，都看是不可能都看的。</p><blockquote><p>PS：注释特别加了 nolint:gocyclo，可见复杂的别人都懒的动了，不然对于这样长的函数还是拆分为多个可读性会更好</p></blockquote><p>为什么这么长呢？原因其实是因为写了一大堆的闭包在里面，而其实最重要的放在了最后</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// internal/ingress/controller/store/store.go:850</span></span><br><span class="line"><span class="keyword">if</span> _, err := store.informers.Ingress.AddEventHandler(ingEventHandler); err != <span class="literal">nil</span> &#123;</span><br><span class="line">    klog.Errorf(<span class="string">&quot;Error adding ingress event handler: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> !icConfig.IgnoreIngressClass &#123;</span><br><span class="line">    <span class="keyword">if</span> _, err := store.informers.IngressClass.AddEventHandler(ingressClassEventHandler); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        klog.Errorf(<span class="string">&quot;Error adding ingress class event handler: %v&quot;</span>, err)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> _, err := store.informers.EndpointSlice.AddEventHandler(epsEventHandler); err != <span class="literal">nil</span> &#123;</span><br><span class="line">    klog.Errorf(<span class="string">&quot;Error adding endpoint slice event handler: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> _, err := store.informers.Secret.AddEventHandler(secrEventHandler); err != <span class="literal">nil</span> &#123;</span><br><span class="line">    klog.Errorf(<span class="string">&quot;Error adding secret event handler: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> _, err := store.informers.ConfigMap.AddEventHandler(cmEventHandler); err != <span class="literal">nil</span> &#123;</span><br><span class="line">    klog.Errorf(<span class="string">&quot;Error adding configmap event handler: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> _, err := store.informers.Service.AddEventHandler(serviceHandler); err != <span class="literal">nil</span> &#123;</span><br><span class="line">    klog.Errorf(<span class="string">&quot;Error adding service event handler: %v&quot;</span>, err)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>看到这里其实就很清楚了，关键是就实现了 informer 的各种 event handler，然后对于各种事件做了不同的处理。其中就有 ingress 的事件。而在对应的 <code>ingEventHandler</code> 处理方法中：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">ingEventHandler := cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">    AddFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">        ing, _ := toIngress(obj)</span><br><span class="line"></span><br><span class="line">        <span class="comment">//...</span></span><br><span class="line"></span><br><span class="line">        store.syncIngress(ing)</span><br><span class="line">        store.updateSecretIngressMap(ing)</span><br><span class="line">        store.syncSecrets(ing)</span><br><span class="line"></span><br><span class="line">        updateCh.In() &lt;- Event&#123;</span><br><span class="line">            Type: CreateEvent,</span><br><span class="line">            Obj:  obj,</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;,</span><br></pre></td></tr></table></figure><p>关键就是将各种事件放到 <code>updateCh</code> 里面去。这里的这个 <code>updateCh</code> 又是我们可以学习的一个小点，它其实是一个 <code>RingChannel</code> 并且其实这库已经不更新了，但这里依旧沿用了，说明一个 10 年前设计的库还是很好用的，特别是它对于暴露功能的设计还是有迹可循的，有兴趣可以看看。而 <code>updateCh</code> 是在哪里消费的呢？其实这一点你通过看之前的 k8s 源码应该有所熟悉了，消费的地方正式在启动的时候设置好了。</p><h4 id="启动"><a href="#启动" class="headerlink" title="启动"></a>启动</h4><p><code>Start</code> 并不复杂，精简一下如下：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// internal/ingress/controller/nginx.go:270</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(n *NGINXController)</span></span> Start() &#123;</span><br><span class="line">    klog.InfoS(<span class="string">&quot;Starting NGINX Ingress controller&quot;</span>)</span><br><span class="line"></span><br><span class="line">    n.store.Run(n.stopCh)</span><br><span class="line">    cmd := n.command.ExecCommand()</span><br><span class="line"></span><br><span class="line">    n.start(cmd)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">go</span> n.syncQueue.Run(time.Second, n.stopCh)</span><br><span class="line">    <span class="comment">// force initial sync</span></span><br><span class="line">    n.syncQueue.EnqueueTask(task.GetDummyObject(<span class="string">&quot;initial-sync&quot;</span>))</span><br><span class="line"></span><br><span class="line">    <span class="keyword">for</span> &#123;</span><br><span class="line">        <span class="keyword">select</span> &#123;</span><br><span class="line">        <span class="keyword">case</span> err := &lt;-n.ngxErrCh:</span><br><span class="line">            <span class="keyword">if</span> n.isShuttingDown &#123;</span><br><span class="line">                <span class="keyword">return</span></span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">            <span class="comment">// if the nginx master process dies, the workers continue to process requests</span></span><br><span class="line">            <span class="comment">// until the failure of the configured livenessProbe and restart of the pod.</span></span><br><span class="line">            <span class="keyword">if</span> process.IsRespawnIfRequired(err) &#123;</span><br><span class="line">                <span class="keyword">return</span></span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">        <span class="keyword">case</span> event := &lt;-n.updateCh.Out():</span><br><span class="line">            <span class="keyword">if</span> n.isShuttingDown &#123;</span><br><span class="line">                <span class="keyword">break</span></span><br><span class="line">            &#125;</span><br><span class="line"></span><br><span class="line">            <span class="keyword">if</span> evt, ok := event.(store.Event); ok &#123;</span><br><span class="line">                klog.V(<span class="number">3</span>).InfoS(<span class="string">&quot;Event received&quot;</span>, <span class="string">&quot;type&quot;</span>, evt.Type, <span class="string">&quot;object&quot;</span>, evt.Obj)</span><br><span class="line">                <span class="keyword">if</span> evt.Type == store.ConfigurationEvent &#123;</span><br><span class="line">                    <span class="comment">// <span class="doctag">TODO:</span> is this necessary? Consider removing this special case</span></span><br><span class="line">                    n.syncQueue.EnqueueTask(task.GetDummyObject(<span class="string">&quot;configmap-change&quot;</span>))</span><br><span class="line">                    <span class="keyword">continue</span></span><br><span class="line">                &#125;</span><br><span class="line"></span><br><span class="line">                n.syncQueue.EnqueueSkippableTask(evt.Obj)</span><br><span class="line">            &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">                klog.Warningf(<span class="string">&quot;Unexpected event type received %T&quot;</span>, event)</span><br><span class="line">            &#125;</span><br><span class="line">        <span class="keyword">case</span> &lt;-n.stopCh:</span><br><span class="line">            <span class="keyword">return</span></span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>我们刚才的关键 <code>updateCh</code> 就在这里，<code>event := &lt;-n.updateCh.Out()</code>，从中获得对应的事件然后放到 <code>syncQueue</code> 里面，而 <code>syncQueue</code> 则是我们在 <code>NewNGINXController</code> 的时候初始化的</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">n.syncQueue = task.NewTaskQueue(n.syncIngress)</span><br></pre></td></tr></table></figure><p>所以最终就落到了 <code>syncIngress</code> 方法上：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// internal/ingress/controller/controller.go:175</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(n *NGINXController)</span></span> syncIngress(<span class="keyword">interface</span>&#123;&#125;) <span class="type">error</span> &#123;</span><br><span class="line">    n.syncRateLimiter.Accept()</span><br><span class="line"></span><br><span class="line">    ings := n.store.ListIngresses()</span><br><span class="line">    hosts, servers, pcfg := n.getConfiguration(ings)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> n.runningConfig.Equal(pcfg) &#123;</span><br><span class="line">        klog.V(<span class="number">3</span>).Infof(<span class="string">&quot;No configuration change detected, skipping backend reload&quot;</span>)</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    n.metricCollector.SetHosts(hosts)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> !utilingress.IsDynamicConfigurationEnough(pcfg, n.runningConfig) &#123;</span><br><span class="line">        klog.InfoS(<span class="string">&quot;Configuration changes detected, backend reload required&quot;</span>)</span><br><span class="line"></span><br><span class="line">        hash, err := hashstructure.Hash(pcfg, hashstructure.FormatV1, &amp;hashstructure.HashOptions&#123;</span><br><span class="line">            TagName: <span class="string">&quot;json&quot;</span>,</span><br><span class="line">        &#125;)</span><br><span class="line">        <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">            klog.Errorf(<span class="string">&quot;unexpected error hashing configuration: %v&quot;</span>, err)</span><br><span class="line">        &#125;</span><br><span class="line"></span><br><span class="line">        pcfg.ConfigurationChecksum = fmt.Sprintf(<span class="string">&quot;%v&quot;</span>, hash)</span><br><span class="line"></span><br><span class="line">        err = n.OnUpdate(*pcfg)</span><br><span class="line"></span><br><span class="line">        klog.InfoS(<span class="string">&quot;Backend successfully reloaded&quot;</span>)</span><br><span class="line">        n.metricCollector.ConfigSuccess(hash, <span class="literal">true</span>)</span><br><span class="line">        n.metricCollector.IncReloadCount()</span><br><span class="line"></span><br><span class="line">        n.recorder.Eventf(k8s.IngressPodDetails, apiv1.EventTypeNormal, <span class="string">&quot;RELOAD&quot;</span>, <span class="string">&quot;NGINX reload triggered due to a change in configuration&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">   <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>然后让我们把目光聚到 <code>err = n.OnUpdate(*pcfg)</code>（一开始我压根没注意这个部分，也是仔细寻找才发现了）</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(n *NGINXController)</span></span> OnUpdate(ingressCfg ingress.Configuration) <span class="type">error</span> &#123;</span><br><span class="line">    cfg := n.store.GetBackendConfiguration()</span><br><span class="line">    cfg.Resolver = n.resolver</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ....</span></span><br><span class="line"></span><br><span class="line">    content, err := n.generateTemplate(cfg, ingressCfg)</span><br><span class="line"></span><br><span class="line">    err = n.createLuaConfig(&amp;cfg)</span><br><span class="line"></span><br><span class="line">    err = createOpentelemetryCfg(&amp;cfg)</span><br><span class="line"></span><br><span class="line">    err = n.testTemplate(content)</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ....</span></span><br><span class="line"></span><br><span class="line">    err = os.WriteFile(cfgPath, content, file.ReadWriteByUser)</span><br><span class="line"></span><br><span class="line">    o, err := n.command.ExecCommand(<span class="string">&quot;-s&quot;</span>, <span class="string">&quot;reload&quot;</span>).CombinedOutput()</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Reload status checking runs in a separate goroutine to avoid blocking the sync queue</span></span><br><span class="line">    <span class="keyword">if</span> workerSerialReloads &#123;</span><br><span class="line">        <span class="keyword">go</span> n.awaitWorkersReload()</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>看到这里其实就很清楚了，首先就是根据 ingress 信息和模板创建对应的配置文件。而生成配置文件所需要的模板是在 <code>rootfs/etc/nginx/template/nginx.tmpl</code> 的位置，而看了模板你也就大概明白了，这不仅是 nginx 的配置吗。</p><p>然后发现，这不就是我最常用的 <code>nginx -s reload</code> 命令吗？原来就这样简单。这下，从 k8s 的资源变动到 nginx 配置变化，整个链路就能串起来了。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li>ingress-nginx 本身究竟是个什么东西？回答：本质其实就是个 deployment 而已，而整个就是打包了一组 k8s 的各种资源在里面</li><li>Ingress Controller 是如何监听到 Ingress 资源的变化的？回答：其实就是通过 informer 监听了对应资源变更的事件，并且注册了这些事件对应的处理方法，而 <code>SharedInformer</code> 其实是 client-go 里面的实现。</li><li>Ingress Controller 是如何将规则转换为实际的负载均衡配置的？回答：关键就是通过这些规则的配置，通过模板来生成的。</li></ol><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><p>其实原理上对于 ingress 其实本身并不复杂，而更多的是，我觉得它可以作为我们学习如果制作一个 k8s 扩展组件的一个案例，如何使用 Apiserver，如何处理对应的事件，以及如何操作对应的资源等等，将它作为一个参考确实是不错的选择。</p><h3 id="编码上"><a href="#编码上" class="headerlink" title="编码上"></a>编码上</h3><h4 id="HandleSigterm"><a href="#HandleSigterm" class="headerlink" title="HandleSigterm"></a>HandleSigterm</h4><p>之前在 <code>main</code> 入口的最下面我们看到过下面的代码：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">process.HandleSigterm(ngx, conf.PostShutdownGracePeriod, <span class="function"><span class="keyword">func</span><span class="params">(code <span class="type">int</span>)</span></span> &#123;</span><br><span class="line">os.Exit(code)</span><br><span class="line">&#125;)</span><br></pre></td></tr></table></figure><p>其中的 <code>HandleSigterm</code> 的实现其实非常简单，对于一个小型项目的优雅关闭是一个非常不错的案例，可以直接拿来用。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/util/process/sigterm.go:32</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">HandleSigterm</span><span class="params">(ngx Controller, delay <span class="type">int</span>, exit exiter)</span></span> &#123;</span><br><span class="line">    signalChan := <span class="built_in">make</span>(<span class="keyword">chan</span> os.Signal, <span class="number">1</span>)</span><br><span class="line">    signal.Notify(signalChan, syscall.SIGTERM)</span><br><span class="line">    &lt;-signalChan</span><br><span class="line">    klog.InfoS(<span class="string">&quot;Received SIGTERM, shutting down&quot;</span>)</span><br><span class="line"></span><br><span class="line">    exitCode := <span class="number">0</span></span><br><span class="line">    <span class="keyword">if</span> err := ngx.Stop(); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        klog.Warningf(<span class="string">&quot;Error during shutdown: %v&quot;</span>, err)</span><br><span class="line">        exitCode = <span class="number">1</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    klog.Infof(<span class="string">&quot;Handled quit, delaying controller exit for %d seconds&quot;</span>, delay)</span><br><span class="line">    time.Sleep(time.Duration(delay) * time.Second)</span><br><span class="line"></span><br><span class="line">    klog.InfoS(<span class="string">&quot;Exiting&quot;</span>, <span class="string">&quot;code&quot;</span>, exitCode)</span><br><span class="line">    exit(exitCode)</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><h4 id="syncQueue"><a href="#syncQueue" class="headerlink" title="syncQueue"></a>syncQueue</h4><p>还有一个 <code>syncQueue</code> 其实也可以简单参考下，它是对于 <code>client-go</code> 里面的 <code>workqueue</code> 的封装，本来其实我觉得 <code>workqueue</code> 的实现已经非常好用了，特别是拿他来做泛型的使用案例特别好，所以比较值得一看，以后很大机会能用上。其中有一个取巧的封装是对于 <code>EnqueueSkippableTask</code> 方法，我一开始还在想什么是 <code>skippable</code> 可以跳过的任务，仔细一看发现其实就是任务的优先级，而当可以 <code>skippable</code> 也就是优先级低的时候，只不过将时间延迟了而已。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// internal/task/queue.go:74</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(t *Queue)</span></span> enqueue(obj <span class="keyword">interface</span>&#123;&#125;, skippable <span class="type">bool</span>) &#123;</span><br><span class="line">    <span class="keyword">if</span> t.IsShuttingDown() &#123;</span><br><span class="line">        klog.ErrorS(<span class="literal">nil</span>, <span class="string">&quot;queue has been shutdown, failed to enqueue&quot;</span>, <span class="string">&quot;key&quot;</span>, obj)</span><br><span class="line">        <span class="keyword">return</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    ts := time.Now().UnixNano()</span><br><span class="line">    <span class="keyword">if</span> !skippable &#123;</span><br><span class="line">        <span class="comment">// make sure the timestamp is bigger than lastSync</span></span><br><span class="line">        ts = time.Now().Add(<span class="number">24</span> * time.Hour).UnixNano() <span class="comment">// 就只是这样而已</span></span><br><span class="line">    &#125;</span><br><span class="line">    klog.V(<span class="number">3</span>).InfoS(<span class="string">&quot;queuing&quot;</span>, <span class="string">&quot;item&quot;</span>, obj)</span><br><span class="line">    key, err := t.fn(obj)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        klog.ErrorS(err, <span class="string">&quot;creating object key&quot;</span>, <span class="string">&quot;item&quot;</span>, obj)</span><br><span class="line">        <span class="keyword">return</span></span><br><span class="line">    &#125;</span><br><span class="line">    t.queue.Add(Element&#123;</span><br><span class="line">        Key:       key,</span><br><span class="line">        Timestamp: ts,</span><br><span class="line">    &#125;)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>对应到实际使用上，其实就是对于外部的 event 的相应任务可以慢慢处理，而对于文件配置的变动需要立刻做出相应，优先级更高。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>向量数据库简单使用对比</title>
    <link href="https://www.linkinstars.com/post/3f5e0ed7.html"/>
    <id>https://www.linkinstars.com/post/3f5e0ed7.html</id>
    <published>2025-02-28T16:00:00.000Z</published>
    <updated>2025-03-26T09:28:17.104Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/3f5e0ed7.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>AI 带火了向量数据库，来对比几个不同的向量数据库，使用感受和差异，个人非标测试，仅供参考。</p><ul><li><a href="https://github.com/milvus-io/milvus">milvus</a></li><li><a href="https://github.com/pgvector/pgvector">pgvector</a></li><li><a href="https://github.com/qdrant/qdrant">qdrant</a></li><li><a href="https://github.com/elastic/elasticsearch">elasticsearch</a></li></ul><p>具体每个我就不多介绍了，官网都有</p><h2 id="使用场景"><a href="#使用场景" class="headerlink" title="使用场景"></a>使用场景</h2><p>正所谓不设定场景的测试和比较都是耍流氓，所以先说明一下使用场景。由于大厂的技术选型往往就直接冲着数据量、扩展性等等方向去了，而这次我们的角度不一样，更贴近与个人用户使用，准备用在个人项目中，并且直接安装在个人电脑上。</p><ul><li>本来就是单点，不需要考虑什么扩展</li><li>数据量不大，一个人撑死能有多大的数据</li><li>使尽可能减少资源占用</li></ul><h2 id="安装"><a href="#安装" class="headerlink" title="安装"></a>安装</h2><p>全部使用 docker 一把梭，使用的宿主机相同配置，都是足够的。</p><h3 id="milvus"><a href="#milvus" class="headerlink" title="milvus"></a>milvus</h3><p>其中 attu 是专门用来查看的，默认自带的功能比较少，没法直接操作</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">version:</span> <span class="string">&#x27;3.5&#x27;</span></span><br><span class="line"><span class="attr">services:</span></span><br><span class="line">  <span class="attr">etcd:</span></span><br><span class="line">    <span class="attr">container_name:</span> <span class="string">milvus-etcd</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">quay.io/coreos/etcd:v3.5.18</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">ETCD_AUTO_COMPACTION_MODE=revision</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">ETCD_AUTO_COMPACTION_RETENTION=1000</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">ETCD_QUOTA_BACKEND_BYTES=4294967296</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">ETCD_SNAPSHOT_COUNT=50000</span></span><br><span class="line">    <span class="attr">command:</span> <span class="string">etcd</span> <span class="string">-advertise-client-urls=http://127.0.0.1:2379</span> <span class="string">-listen-client-urls</span> <span class="string">http://0.0.0.0:2379</span> <span class="string">--data-dir</span> <span class="string">/etcd</span></span><br><span class="line">    <span class="attr">healthcheck:</span></span><br><span class="line">      <span class="attr">test:</span> [<span class="string">&quot;CMD&quot;</span>, <span class="string">&quot;etcdctl&quot;</span>, <span class="string">&quot;endpoint&quot;</span>, <span class="string">&quot;health&quot;</span>]</span><br><span class="line">      <span class="attr">interval:</span> <span class="string">30s</span></span><br><span class="line">      <span class="attr">timeout:</span> <span class="string">20s</span></span><br><span class="line">      <span class="attr">retries:</span> <span class="number">3</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">minio:</span></span><br><span class="line">    <span class="attr">container_name:</span> <span class="string">milvus-minio</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">minio/minio:RELEASE.2023-03-20T20-16-18Z</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="attr">MINIO_ACCESS_KEY:</span> <span class="string">minioadmin</span></span><br><span class="line">      <span class="attr">MINIO_SECRET_KEY:</span> <span class="string">minioadmin</span></span><br><span class="line">    <span class="attr">ports:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;9001:9001&quot;</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;9000:9000&quot;</span></span><br><span class="line">    <span class="attr">command:</span> <span class="string">minio</span> <span class="string">server</span> <span class="string">/minio_data</span> <span class="string">--console-address</span> <span class="string">&quot;:9001&quot;</span></span><br><span class="line">    <span class="attr">healthcheck:</span></span><br><span class="line">      <span class="attr">test:</span> [<span class="string">&quot;CMD&quot;</span>, <span class="string">&quot;curl&quot;</span>, <span class="string">&quot;-f&quot;</span>, <span class="string">&quot;http://localhost:9000/minio/health/live&quot;</span>]</span><br><span class="line">      <span class="attr">interval:</span> <span class="string">30s</span></span><br><span class="line">      <span class="attr">timeout:</span> <span class="string">20s</span></span><br><span class="line">      <span class="attr">retries:</span> <span class="number">3</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">standalone:</span></span><br><span class="line">    <span class="attr">container_name:</span> <span class="string">milvus-standalone</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">milvusdb/milvus:v2.5.5</span></span><br><span class="line">    <span class="attr">command:</span> [<span class="string">&quot;milvus&quot;</span>, <span class="string">&quot;run&quot;</span>, <span class="string">&quot;standalone&quot;</span>]</span><br><span class="line">    <span class="attr">security_opt:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">seccomp:unconfined</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="attr">ETCD_ENDPOINTS:</span> <span class="string">etcd:2379</span></span><br><span class="line">      <span class="attr">MINIO_ADDRESS:</span> <span class="string">minio:9000</span></span><br><span class="line">    <span class="attr">healthcheck:</span></span><br><span class="line">      <span class="attr">test:</span> [<span class="string">&quot;CMD&quot;</span>, <span class="string">&quot;curl&quot;</span>, <span class="string">&quot;-f&quot;</span>, <span class="string">&quot;http://localhost:9091/healthz&quot;</span>]</span><br><span class="line">      <span class="attr">interval:</span> <span class="string">30s</span></span><br><span class="line">      <span class="attr">start_period:</span> <span class="string">90s</span></span><br><span class="line">      <span class="attr">timeout:</span> <span class="string">20s</span></span><br><span class="line">      <span class="attr">retries:</span> <span class="number">3</span></span><br><span class="line">    <span class="attr">ports:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;19530:19530&quot;</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;9091:9091&quot;</span></span><br><span class="line">    <span class="attr">depends_on:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;etcd&quot;</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;minio&quot;</span></span><br><span class="line"></span><br><span class="line">  <span class="attr">attu:</span></span><br><span class="line">    <span class="attr">container_name:</span> <span class="string">milvus-attu</span></span><br><span class="line">    <span class="attr">image:</span> <span class="string">zilliz/attu:v2.4</span></span><br><span class="line">    <span class="attr">environment:</span></span><br><span class="line">      <span class="attr">MILVUS_URL:</span> <span class="string">milvus:19530</span></span><br><span class="line">    <span class="attr">ports:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">&quot;7000:3000&quot;</span></span><br><span class="line">    <span class="attr">networks:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">default</span></span><br><span class="line"></span><br><span class="line"><span class="attr">networks:</span></span><br><span class="line">  <span class="attr">default:</span></span><br><span class="line">    <span class="attr">name:</span> <span class="string">milvus</span></span><br></pre></td></tr></table></figure><h3 id="pgvector"><a href="#pgvector" class="headerlink" title="pgvector"></a>pgvector</h3><p>pgvector 是 postgresql 的一个插件，所以安装之后需要执行命令去启用</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ docker run --name postgresql -e POSTGRES_USER=postgres -e POSTGRES_PASSWORD=password -p 5432:5432 -d pgvector/pgvector:pg17</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ docker <span class="built_in">exec</span> -it postgresql psql -h 127.0.0.1 -p 5432 -U postgres</span><br><span class="line">psql (17.4 (Debian 17.4-1.pgdg120+2))</span><br><span class="line">Type <span class="string">&quot;help&quot;</span> <span class="keyword">for</span> <span class="built_in">help</span>.</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">$ postgres=# <span class="keyword">select</span> * from pg_extension;</span><br><span class="line">  oid  | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition</span><br><span class="line">-------+---------+----------+--------------+----------------+------------+-----------+--------------</span><br><span class="line"> 13569 | plpgsql |       10 |           11 | f              | 1.0        |           |</span><br><span class="line">(1 row)</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">$ postgres=# CREATE EXTENSION vector;</span><br><span class="line">CREATE EXTENSION</span><br><span class="line"></span><br><span class="line">$ postgres=# <span class="keyword">select</span> * from pg_extension;</span><br><span class="line">  oid  | extname | extowner | extnamespace | extrelocatable | extversion | extconfig | extcondition</span><br><span class="line">-------+---------+----------+--------------+----------------+------------+-----------+--------------</span><br><span class="line"> 13569 | plpgsql |       10 |           11 | f              | 1.0        |           |</span><br><span class="line"> 16388 | vector  |       10 |         2200 | t              | 0.8.0      |           |</span><br><span class="line">(2 rows)</span><br></pre></td></tr></table></figure><h3 id="qdrant"><a href="#qdrant" class="headerlink" title="qdrant"></a>qdrant</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker run -p 6333:6333 -p 6334:6334 qdrant/qdrant</span><br></pre></td></tr></table></figure><h3 id="elasticsearch"><a href="#elasticsearch" class="headerlink" title="elasticsearch"></a>elasticsearch</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">docker run -d \</span><br><span class="line">--name elasticsearch \</span><br><span class="line">    -e <span class="string">&quot;ES_JAVA_OPTS=-Xms2048m -Xmx2048m&quot;</span> \</span><br><span class="line">    -e <span class="string">&quot;discovery.type=single-node&quot;</span> \</span><br><span class="line">    --privileged \</span><br><span class="line">    -p 9200:9200 \</span><br><span class="line">    -p 9300:9300 \</span><br><span class="line">elasticsearch:7.17.7</span><br></pre></td></tr></table></figure><h2 id="测试"><a href="#测试" class="headerlink" title="测试"></a>测试</h2><h3 id="前提"><a href="#前提" class="headerlink" title="前提"></a>前提</h3><ul><li>相同的向量数据，是同一个模型 <code>embedding</code> 生成<strong>相同的向量数据</strong>，维度 1024</li><li>相同的索引算法， <strong>HNSW (Hierarchical Navigable Small World)</strong></li><li>相同的计算方法，<strong>COSINE</strong> 余弦相似度</li></ul><h3 id="查询速度"><a href="#查询速度" class="headerlink" title="查询速度"></a>查询速度</h3><p>小数据量下区别不大，大数据量下这个差距会扩大</p><ul><li>milvus 执行时间：452.41ms</li><li>pgvector 执行时间：180.65ms</li><li>qdrant 执行时间：69.00ms</li><li>elasticsearch 执行时间：314.35ms</li></ul><h3 id="CPU"><a href="#CPU" class="headerlink" title="CPU"></a>CPU</h3><p>pgvector 的 CPU 波动最不明显，qdrant 和 elasticsearch 相差不大，而 milvus 显然 CPU 敏感，测试多次效果基本一致。</p><p><img src="https://blog.linkinstars.com/blog/vectordb-test-cpu-usage.png" alt="vectordb-test-cpu-usage"></p><h3 id="内存"><a href="#内存" class="headerlink" title="内存"></a>内存</h3><p><img src="https://blog.linkinstars.com/blog/vectordb-test-memory-usage.png" alt="vectordb-test-cpu-usage"></p><p>从内存的角度上来说，很明显 elasticsearch 和 milvus 占用更多，而 qdrant 和 pgvector 从图片中很难看出区别，但实际数据值可以看到一个是 172MiB(pgvector) 一个是 152.9MiB(qdrant) 也区别不大。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line">CONTAINER ID   NAME                CPU %     MEM USAGE / LIMIT     MEM %     NET I/O           BLOCK I/O         PIDS</span><br><span class="line"></span><br><span class="line">4c3b1922b36b   elasticsearch       0.46%     1.619GiB / 5.126GiB   31.59%    378MB / 31.9MB    201MB / 7.75GB    100</span><br><span class="line"></span><br><span class="line">1c0febeff728   postgresql          0.00%     172MiB / 5.126GiB     3.28%     1.28GB / 125MB    342MB / 6.08GB    12</span><br><span class="line"></span><br><span class="line">58fa2a1d8558   qdrant              1.37%     152.9MiB / 5.126GiB   2.91%     648MB / 9.41MB    291MB / 8.23GB    52</span><br><span class="line"></span><br><span class="line">bb05db1d3099   milvus-standalone   6.92%     368.5MiB / 5.126GiB   7.02%     18.9GB / 14GB     776MB / 5.62GB    88</span><br><span class="line">b0c0ff71478e   milvus-attu         0.36%     28.27MiB / 5.126GiB   0.54%     9.66MB / 27.3MB   41.7MB / 96.6MB   23</span><br><span class="line">edad780e6197   milvus-minio        14.68%    133.1MiB / 5.126GiB   2.54%     13.8GB / 18.1GB   18.5GB / 17.5GB   27</span><br><span class="line">fc8e6e0012ae   milvus-etcd         0.53%     30.41MiB / 5.126GiB   0.58%     156MB / 118MB     67.9MB / 12.3GB   15</span><br></pre></td></tr></table></figure><h3 id="使用"><a href="#使用" class="headerlink" title="使用"></a>使用</h3><p>我使用的 go 连接的每个数据库，具体代码就不展示了，每个仓库的使用方式的各有不同，但好在 sdk 都完整所以直接将官方的案例拷贝过来都能用。说几个实际中遇到的问题。</p><h4 id="milvus-1"><a href="#milvus-1" class="headerlink" title="milvus"></a>milvus</h4><p>相比于其他几个，它的使用需要额外在查询之前 <code>LoadCollection</code> ，需要手动 load 到内存里面去。然后如果使用完之后不用了，可以关掉。</p><p>解析结果的时候一定注意查询结果还在内侧，而非最外面</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">searchResult, err := m.client.Search(</span><br><span class="line"></span><br><span class="line">// 需要两个循环</span><br><span class="line">for i := 0; i &lt; len(searchResult); i++ &#123;</span><br><span class="line">for j := 0; j &lt; searchResult[i].ResultCount; j++ &#123;</span><br></pre></td></tr></table></figure><p>别问我怎么知道的，说多了也难受，不知道为什么要这样设计…</p><h4 id="pgvector-1"><a href="#pgvector-1" class="headerlink" title="pgvector"></a>pgvector</h4><p>如果没有使用 orm 的话需要手写 SQL 相较于其他的 API 使用会有一点点门槛，比如：</p><figure class="highlight sql"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">SELECT</span> title, <span class="number">1</span> <span class="operator">-</span> (embedding <span class="operator">&lt;=&gt;</span> $<span class="number">1</span>) <span class="keyword">as</span> similarity</span><br><span class="line"><span class="keyword">FROM</span> embeddings</span><br><span class="line"><span class="keyword">ORDER</span> <span class="keyword">BY</span> embedding <span class="operator">&lt;=&gt;</span> $<span class="number">1</span></span><br><span class="line">LIMIT $<span class="number">2</span></span><br></pre></td></tr></table></figure><h4 id="其他"><a href="#其他" class="headerlink" title="其他"></a>其他</h4><p>一定注意使用的<strong>索引算法</strong>和<strong>向量的维度</strong>，有的时候创建的维度和插入数据维度不一样也不会异常报错的，需要特别注意。特别是在测试不同模型下生成不同维度向量数据效果的时候。</p><h3 id="查询结果的说明"><a href="#查询结果的说明" class="headerlink" title="查询结果的说明"></a>查询结果的说明</h3><blockquote><p>从理论上来说在使用相同算法的情况下，无论哪一个数据库查询结果应该都是相同的</p></blockquote><p>实际测试下来发现</p><ul><li>milvus</li><li>qdrant</li><li>elasticsearch</li></ul><p>相同的查询条件下，以上三者的查询结果相同，而 pgvector，在一些查询条件下结果与其他会有一两个差异，相似度越高差异越小，比如 top5 就没差异，top10 可能就相差一个。这是由于 pgvector 实现的算法逻辑应该是近似的，而非 100%，不过不影响实际使用，后面会有说明。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><ol><li>注意，搜索结果与使用数据库无关，仅与算法有关。</li><li>由于有索引存在，不同的向量数据库在个人使用的场景下均能满足对于查询速度的要求。</li><li>插入，虽然这里的测试并没有提到插入数据，因为实际的使用中对于插入数据来说并不敏感。但由于测试的时候插入的数据量巨大，导致明显 qdrant 遥遥领先于其他插入的时间，对于测试会友好很多。</li></ol><p>个人总结：</p><p>在其他条件类似的情况下，显然使用不同底层语言实现的，计算和内存资源的使用上 Java(elasticsearch) &gt; golang(milvus) &gt; rust(qdrant) ≈ c(pgvector) 感受太明显了，所以那么多人才会用 rust 去重写各种中间件。没办法，底层就直接决定了能力所不同。</p><p>单一向量方式的的相似度对于搜索来说有效，但还不够，一方面取决与向量化模型本身，另一方面用过的都知道，<strong>仅用 COSINE 余弦相似度去比较，最终的查询结果没有那么的符合人的直觉</strong>。所以才会有那么多的搜索系统才有召回、粗排、精排、匹配。所以，单一能力并不能足以胜任工作，如果做的更<strong>准</strong>，还有很多工作要做的。当然，对于个人使用者来说，就我的使用感受来说，显然比一些正常的分词后做匹配要来的更好用。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;AI 带火了向量数据库，来对比几个不同的向量数据库，使用感受和差异，个人非标测试，仅供参考。&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a</summary>
        
      
    
    
    
    <category term="vectordb" scheme="https://www.linkinstars.com/categories/vectordb/"/>
    
    
    <category term="milvus" scheme="https://www.linkinstars.com/tags/milvus/"/>
    
    <category term="pgvector" scheme="https://www.linkinstars.com/tags/pgvector/"/>
    
    <category term="qdrant" scheme="https://www.linkinstars.com/tags/qdrant/"/>
    
    <category term="elasticsearch" scheme="https://www.linkinstars.com/tags/elasticsearch/"/>
    
  </entry>
  
  <entry>
    <title>博客装修(2025年3月)</title>
    <link href="https://www.linkinstars.com/post/d907bfb4.html"/>
    <id>https://www.linkinstars.com/post/d907bfb4.html</id>
    <published>2025-02-27T16:00:00.000Z</published>
    <updated>2025-03-03T10:45:35.598Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/d907bfb4.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>春风拂柳燕归来，细雨轻敲绿满台。桃李争艳芳草盛，人间三月尽芳怀。<br>时间还是真的快啊，但这次几乎没有什么大的变动，好像有点退烧了，算是优化到了一个自己认为非常不错的阶段。</p></blockquote><h2 id="butterfly-主题升级"><a href="#butterfly-主题升级" class="headerlink" title="butterfly 主题升级"></a>butterfly 主题升级</h2><blockquote><p>更新主题版本至 5.3.4 <a href="https://github.com/jerryc127/hexo-theme-butterfly">https://github.com/jerryc127/hexo-theme-butterfly</a></p></blockquote><p>这次跨度比较大，直接从 4.x 到了 5.x ，然后所以的配置文件安装更新的要求全部都要重新设置过了，所以这次过段等上了一段时间才进行了更新，从中将原有的相关部分做了替换。</p><p>改动最大的莫过于原来的自定义页面部分了，特别是主页部分的改动，由于之前的 pug 文件已经不再适用，所以我直接将原有的 pug 文件全部替换成了新的 pug 文件。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">extends includes/layout.pug</span><br><span class="line"></span><br><span class="line">block content</span><br><span class="line">  include ./includes/mixins/post-ui.pug</span><br><span class="line">  #recent-posts.recent-posts</span><br><span class="line">    include includes/categoryGroup.pug</span><br><span class="line">    include includes/categoryBar.pug</span><br><span class="line">    +postUI</span><br><span class="line">    include includes/pagination.pug</span><br></pre></td></tr></table></figure><h2 id="评论系统"><a href="#评论系统" class="headerlink" title="评论系统"></a>评论系统</h2><p>评论组件依旧是自建的 <a href="https://twikoo.js.org/intro.html">twikoo</a> 很喜欢，配置简单。今年我暂时没有升级版本和换评论系统，因为我觉得现在的评论系统已经很好用了，而且我也没有什么特别的需求。</p><h2 id="Notion-Paper"><a href="#Notion-Paper" class="headerlink" title="Notion Paper"></a>Notion Paper</h2><blockquote><p>为什么叫 Notion Paper?</p><p>相比 idea, thought 等，notion 指的是一种模糊的，变化的莫测的想法，无可靠的基础的，未经深思熟虑的观点。这里汇集了一些好的 news，但不是一份 newspaper，而包含了我那些不成熟的 notion，所以我称它为 Notion Paper 。</p></blockquote><p><a href="https://www.linkinstars.com/post/22fd52ad.html">Notion Paper</a> 这部分保持着一定的更新频率，算是我对于博客部分的一种补充，也是去年新加入的一个板块，改动最大的一个部分。今年也将继续保持更新。</p><h2 id="专栏"><a href="#专栏" class="headerlink" title="专栏"></a>专栏</h2><p>其中一个专栏还在不断更新（差点断了线），另一个因为一直找不到合适的素材，所以一直没有更新，这个部分我会继续努力。这部分确实就和其他大神的作者一样，越到后面越难写，知道的越多不知道的就越多这句话是真的，现在让我重新去写网络相关的东西，我可能也会有点手足无措。</p><h2 id="公众号"><a href="#公众号" class="headerlink" title="公众号"></a>公众号</h2><p>这部分我一直没有更新，发现了一个问题就是阅读的数量其实并不多，所以我也没有继续更新，这部分我会继续观察，如果有好的内容我会继续更新。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>想趁着这次机会说说 AI 来了，Blog 还能写吗？答案是一定的。如果是个人的 Blog，那么它的价值就在于记录，记录的是你的生活，你的思考，你的感悟，你的成长。AI 可以帮你写文章，但它写不出<strong>你的文章</strong>。如果单纯是为了 SEO，为了流量，广告去写，那么用 AI 产出内容就一定会“水到渠成”，毕竟 AI 的产出成本远远低于人类。</p><p>所以，个人的 Blog 还能写，而且是越来越有价值的。这是我对于 Blog 的看法，也是我一直坚持写 Blog 的原因。</p><p>这也是我最近看到这篇文章给自己的一个思考，也分享给你: <a href="https://www.gilesthomas.com/2025/02/blogging-in-the-age-of-ai">https://www.gilesthomas.com/2025/02/blogging-in-the-age-of-ai</a></p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;春风拂柳燕归来，细雨轻敲绿满台。桃李争艳芳草盛，人间三月尽芳怀。&lt;br&gt;时间还是真的快啊，但这次几乎没有什么大的变动，好像有点退烧了，算是优化到了一个自己认为非常不错的阶段。&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2</summary>
        
      
    
    
    
    <category term="博客装修" scheme="https://www.linkinstars.com/categories/%E5%8D%9A%E5%AE%A2%E8%A3%85%E4%BF%AE/"/>
    
    
    <category term="hexo主题魔改" scheme="https://www.linkinstars.com/tags/hexo%E4%B8%BB%E9%A2%98%E9%AD%94%E6%94%B9/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》kube-proxy 默默做了什么？</title>
    <link href="https://www.linkinstars.com/post/e38334a9.html"/>
    <id>https://www.linkinstars.com/post/e38334a9.html</id>
    <published>2025-01-14T16:00:00.000Z</published>
    <updated>2025-01-16T08:07:51.569Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/e38334a9.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>前一节我们了解了 Service 和 Endpoint 的创建，发现他们本质上还是在改 k8s 内部的配置(数据)，并没有实际去干活，<strong>也就是最终没有反应在机器的网络配置上</strong>。我们知道，想要让你的 pod 可以被 <code>ClusterIP</code> 访问或者是通过外部访问，都需要对于网络设备做一些规则的调整，也就是我们常说的 <code>iptables</code> 或 <code>ipvs</code>。而 <code>kube-proxy</code> 组件就是来做这个工作的，它会将所需要的路由规则最终配置好。那么它究竟是如何做的呢？</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>ipvs、iptables 的基础</li></ul><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><ol><li><code>kube-proxy</code> 是什么类型的对象？</li><li><code>ipvs</code> 是由谁来配的？</li><li><code>ipvs</code> 后续如何更新？</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><p>首先我们需要找到 <code>kube-proxy</code> 在哪。和之前不一样的是，之前我们看到都是一个个对象，而这次它是一个<strong>独立的组件</strong>。如果你之前没看过，也没了解过，<strong>你可以先思考一个问题 <code>kube-proxy</code> 应该是一个什么类型的对象呢</strong>？deployment 还是 statefulset 呢？最后我们再来验证一下。</p><h3 id="启动"><a href="#启动" class="headerlink" title="启动"></a>启动</h3><h4 id="启动命令"><a href="#启动命令" class="headerlink" title="启动命令"></a>启动命令</h4><p>一个独立的组件，其实也就是一个二进制，运行的一个程序而已，从 yaml 中启动的命令我们可以看到，参数是一个配置文件，还有一个 node 名称。</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">command:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">/usr/local/bin/kube-proxy</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">&#x27;--config=/var/lib/kube-proxy/config.conf&#x27;</span></span><br><span class="line">    <span class="bullet">-</span> <span class="string">&#x27;--hostname-override=$(NODE_NAME)&#x27;</span></span><br></pre></td></tr></table></figure><h4 id="入口"><a href="#入口" class="headerlink" title="入口"></a>入口</h4><p>这样的组件我们非常容易通过 cmd 找到入口 <code>kubernetes/cmd/kube-proxy/proxy.go</code>。之前没有提到过，k8s 中是使用 <code>cobra</code> 来实现命令行的功能的，非常好用的一个库。然后 <code>kube-proxy</code> 就一个命令直接 run。</p><p>在 <code>cmd/kube-proxy/app/server.go:501</code> 很容易顺着路径找到。<code>NewProxyCommand</code> -&gt; <code>opts.Run()</code> -&gt; <code>newProxyServer</code> -&gt; <code>createProxier</code></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/kube-proxy/app/server.go:359</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(o *Options)</span></span> Run() <span class="type">error</span> &#123;</span><br><span class="line">    <span class="keyword">defer</span> <span class="built_in">close</span>(o.errCh)</span><br><span class="line">    <span class="keyword">if</span> <span class="built_in">len</span>(o.WriteConfigTo) &gt; <span class="number">0</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> o.writeConfigFile()</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> o.CleanupAndExit &#123;</span><br><span class="line">        <span class="keyword">return</span> cleanupAndExit()</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    proxyServer, err := newProxyServer(o.config, o.master)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> err</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    o.proxyServer = proxyServer</span><br><span class="line">    <span class="keyword">return</span> o.runLoop()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>记住这里的，等下还要回来的，其实就是 <code>newProxyServerr</code> 然后 <code>runLoop</code> ，熟悉的感觉。</p><h3 id="Proxier"><a href="#Proxier" class="headerlink" title="Proxier"></a>Proxier</h3><p>然后是第一个要点，运行的模式。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/kube-proxy/app/server_others.go:128</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(s *ProxyServer)</span></span> createProxier(config *proxyconfigapi.KubeProxyConfiguration, dualStack <span class="type">bool</span>) (proxy.Provider, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="keyword">var</span> proxier proxy.Provider</span><br><span class="line">    <span class="keyword">var</span> err <span class="type">error</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// ....</span></span><br><span class="line">    <span class="keyword">if</span> config.Mode == proxyconfigapi.ProxyModeIPTables &#123;</span><br><span class="line">        klog.InfoS(<span class="string">&quot;Using iptables Proxier&quot;</span>)</span><br><span class="line">        <span class="comment">// ....</span></span><br><span class="line">    &#125; <span class="keyword">else</span> <span class="keyword">if</span> config.Mode == proxyconfigapi.ProxyModeIPVS &#123;</span><br><span class="line">        <span class="comment">// ....</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> proxier, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">const</span> (</span><br><span class="line">ProxyModeIPTables    ProxyMode = <span class="string">&quot;iptables&quot;</span></span><br><span class="line">ProxyModeIPVS        ProxyMode = <span class="string">&quot;ipvs&quot;</span></span><br><span class="line">ProxyModeKernelspace ProxyMode = <span class="string">&quot;kernelspace&quot;</span></span><br><span class="line">)</span><br></pre></td></tr></table></figure><p><code>iptables</code> 和 <code>ipvs</code> 也就是 proxy 的模式，也就是 <code>kube-proxy</code> 最重要的实现了。</p><blockquote><p>这里我们其实可以看到里面 <code>NewProxier</code> 这样类似的方法参数是非常多的，而且并没有封装成对象，可以看到观感是并不好的，而且如果改动起来还是比较麻烦的，你必须仔细核对每个参数的顺序。如果是实际中，我是不会建议这样写的。</p></blockquote><p>我们知道 iptables 的实现是存在性能问题的，相较于它我们更关注 ipvs 的实现，而对于 <code>NewProxier</code> 里面的参数太多这里不一一说明。看一个小点。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/proxy/ipvs/proxier.go:309</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewProxier</span><span class="params">(ipFamily v1.IPFamily,</span></span></span><br><span class="line"><span class="params"><span class="function">    // ...</span></span></span><br><span class="line"><span class="params"><span class="function">)</span></span> (*Proxier, <span class="type">error</span>) &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    proxier := &amp;Proxier&#123;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    proxier.syncRunner = async.NewBoundedFrequencyRunner(<span class="string">&quot;sync-runner&quot;</span>, proxier.syncProxyRules, minSyncPeriod, syncPeriod, burstSyncs)</span><br><span class="line">    proxier.gracefuldeleteManager.Run()</span><br><span class="line">    <span class="keyword">return</span> proxier, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>记住这里的 <code>syncRunner</code> 是一个 <code>NewBoundedFrequencyRunner</code> 类型，而其具体里面内部调用的方法其实是 <code>proxier.syncProxyRules</code></p><h3 id="Run"><a href="#Run" class="headerlink" title="Run"></a>Run</h3><p>然后创建完，就是启动了，Run 里面的启动，关键就在下面这部分我精简了一下：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// cmd/kube-proxy/app/server.go:844</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(s *ProxyServer)</span></span> Run() <span class="type">error</span> &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">    <span class="comment">// Create configs (i.e. Watches for Services and EndpointSlices)</span></span><br><span class="line">    <span class="comment">// Note: RegisterHandler() calls need to happen before creation of Sources because sources</span></span><br><span class="line">    <span class="comment">// only notify on changes, and the initial update (on process start) may be lost if no handlers</span></span><br><span class="line">    <span class="comment">// are registered yet.</span></span><br><span class="line">    serviceConfig := config.NewServiceConfig(informerFactory.Core().V1().Services(), s.Config.ConfigSyncPeriod.Duration)</span><br><span class="line">    serviceConfig.RegisterEventHandler(s.Proxier)</span><br><span class="line">    <span class="keyword">go</span> serviceConfig.Run(wait.NeverStop)</span><br><span class="line"></span><br><span class="line">    endpointSliceConfig := config.NewEndpointSliceConfig(informerFactory.Discovery().V1().EndpointSlices(), s.Config.ConfigSyncPeriod.Duration)</span><br><span class="line">    endpointSliceConfig.RegisterEventHandler(s.Proxier)</span><br><span class="line">    <span class="keyword">go</span> endpointSliceConfig.Run(wait.NeverStop)</span><br><span class="line"></span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    nodeConfig.RegisterEventHandler(s.Proxier)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">go</span> nodeConfig.Run(wait.NeverStop)</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Birth Cry after the birth is successful</span></span><br><span class="line">    s.birthCry()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">go</span> s.Proxier.SyncLoop()</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> &lt;-errCh</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>可以看到三个</p><ul><li><code>serviceConfig.RegisterEventHandler(s.Proxier)</code></li><li><code>endpointSliceConfig.RegisterEventHandler(s.Proxier)</code></li><li><code>nodeConfig.RegisterEventHandler(s.Proxier)</code></li></ul><p>分别对应与三种不通类比的事件，service、endpoint、node。看到这里，我相信你应该也已经明白了，至少当这些资源出现变化的时候，显然 <code>kube-proxy</code> 也许需要做相应的处理。</p><p>不仅如此，让我们关注到最后的 <code>go s.Proxier.SyncLoop()</code> 也就是说，它自己本身也有一个同步循环在的。而上面三个，最后也落在来循环上，有兴趣你可以进去继续看看。</p><blockquote><p>注意此时开始，我们不能通过简单的点击来查看源码了，因为这里涉及了两个看源码的容易迷糊的点。</p><ol><li>Proxier 有不同的实现，对此我们直接来到 <code>pkg/proxy/ipvs/proxier.go</code> 目录下看即可，不需要点，直接看具体方法实现</li><li>在 SyncLoop 里面其实是调用了 <code>syncRunner.Loop</code> 而如果你直接点击，啪，晕了，因为没有具体实现，而这就是我们前面提到的 <code>NewBoundedFrequencyRunner</code>，调用的方法其实是初始化的时候就传递进去了，也就是 内部调用的方法其实是 <code>proxier.syncProxyRules</code></li></ol></blockquote><h3 id="syncProxyRules"><a href="#syncProxyRules" class="headerlink" title="syncProxyRules"></a>syncProxyRules</h3><p>重点来了，本文的重点是它 <code>syncProxyRules</code>。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// This is where all of the ipvs calls happen.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(proxier *Proxier)</span></span> syncProxyRules()</span><br></pre></td></tr></table></figure><p>注释写的非常清楚，这就是 <code>ipvs</code> 干的全部的事情了，都在里面 600 行左右的代码。当然，如果不是为了研究 <code>ipvs</code> 本身，我觉得你没必要仔细研究里面内部的配置。</p><blockquote><p>无论是 <code>iptables</code> 和 <code>ipvs</code> 其本质是什么？<strong>路由规则</strong>，或者说路由表。它就是需要知道一个 ip 的路由规则是什么，应该怎么走。围绕这一点，你就可以大致了解到整个方法的内容了。</p></blockquote><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(proxier *Proxier)</span></span> syncProxyRules() &#123;</span><br><span class="line">    serviceUpdateResult := proxier.svcPortMap.Update(proxier.serviceChanges)</span><br><span class="line">    endpointUpdateResult := proxier.endpointsMap.Update(proxier.endpointsChanges)</span><br><span class="line"></span><br><span class="line">    <span class="comment">// Build IPVS rules for each service.</span></span><br><span class="line">    <span class="keyword">for</span> svcPortName, svcPort := <span class="keyword">range</span> proxier.svcPortMap &#123;</span><br><span class="line">        <span class="comment">// ...</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    proxier.writeIptablesRules()</span><br><span class="line"></span><br><span class="line">    proxier.iptablesData.Reset()</span><br><span class="line">    proxier.iptablesData.Write(proxier.natChains.Bytes())</span><br><span class="line">    proxier.iptablesData.Write(proxier.natRules.Bytes())</span><br><span class="line">    proxier.iptablesData.Write(proxier.filterChains.Bytes())</span><br><span class="line">    proxier.iptablesData.Write(proxier.filterRules.Bytes())</span><br><span class="line"></span><br><span class="line">    err = proxier.iptables.RestoreAll(proxier.iptablesData.Bytes(), utiliptables.NoFlushTables, utiliptables.RestoreCounters)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> err := proxier.serviceHealthServer.SyncServices(proxier.svcPortMap.HealthCheckNodePorts()); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        klog.ErrorS(err, <span class="string">&quot;Error syncing healthcheck services&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">if</span> err := proxier.serviceHealthServer.SyncEndpoints(proxier.endpointsMap.LocalReadyEndpoints()); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        klog.ErrorS(err, <span class="string">&quot;Error syncing healthcheck endpoints&quot;</span>)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    metrics.SyncProxyRulesNoLocalEndpointsTotal.WithLabelValues(<span class="string">&quot;internal&quot;</span>).Set(<span class="type">float64</span>(proxier.serviceNoLocalEndpointsInternal.Len()))</span><br><span class="line">    metrics.SyncProxyRulesNoLocalEndpointsTotal.WithLabelValues(<span class="string">&quot;external&quot;</span>).Set(<span class="type">float64</span>(proxier.serviceNoLocalEndpointsExternal.Len()))</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>没错，精简之后就是这部分。其中最为关键的就是 <code>for svcPortName, svcPort := range proxier.svcPortMap &#123;</code> 这个循环了，里面将所有需要 ipvs 的规则找出来了。整个方法的过程可以简单总结为：</p><ol><li>遍历 service、endpoint 配置</li><li>根据需要找到所有需要设置的 ipvs 规则</li><li>设置 ipvs 规则</li><li>同步 iptables 的配置</li></ol><p>最终我们能通过 <code>ipvsadm -ln</code> 命令看到它配置的具体情况，当然还能通过 <code>ip addr</code> 命令看到 <code>kube-ipvs</code> 的虚拟网络设备</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">$ ipvsadm -<span class="built_in">ln</span></span><br><span class="line">.....</span><br><span class="line">TCP  10.94.0.1:443 rr</span><br><span class="line">  -&gt; 192.168.51.101:6443</span><br><span class="line">.....</span><br></pre></td></tr></table></figure><p>总结一下，其实 <code>kube-proxy</code> 做的核心事情非常简单，就是监听并定期更新 service 和 endpoint 对象所需要配置的路由规则，最终的访问就是通过这些路由规则来完成找到对应服务的。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li><code>kube-proxy</code> 是什么类型的对象？<ol><li>答案是在 kube-system namespace 下的一个 DaemonSet，其实容易想到的，因为每个网络规则的配置最终应该配置到宿主机上，而每一个节点就对应了一个宿主机，而 DaemonSet 正好每个节点一个，非常适合。</li></ol></li><li><code>ipvs</code> 是由谁来配的？<ol><li>当然就是我们的 <code>kube-proxy</code> 组件咯</li></ol></li><li><code>ipvs</code> 后续如何更新？<ol><li>通过监听并定期更新</li></ol></li></ol><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><p>其实从 <code>kube-proxy</code> 你更能体会到 Linux 的强大，无论是 docker 还是 k8s 其实最终极的技术本质都是建立在已有的 Linux 功能上的，无论是这里提到的 ipvs 还是 namespace、cgroup 等等都是。当我们慢慢构建整个系统的过程中，会遇到一个又一个问题，寻找并使用已有的技术手段去解决，同时在规模不断变大，会再次出现问题，然后再寻找更好更优秀的解决方案并优化迭代，这就是软件开发的魅力。</p><h3 id="编码上"><a href="#编码上" class="headerlink" title="编码上"></a>编码上</h3><h4 id="OOMAdjuster"><a href="#OOMAdjuster" class="headerlink" title="OOMAdjuster"></a>OOMAdjuster</h4><p>在 ProxyServer 启动的时候有这样一部分代码</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(s *ProxyServer)</span></span> Run() <span class="type">error</span> &#123;</span><br><span class="line">    <span class="comment">// To help debugging, immediately log version</span></span><br><span class="line">    klog.InfoS(<span class="string">&quot;Version info&quot;</span>, <span class="string">&quot;version&quot;</span>, version.Get())</span><br><span class="line"></span><br><span class="line">    klog.InfoS(<span class="string">&quot;Golang settings&quot;</span>, <span class="string">&quot;GOGC&quot;</span>, os.Getenv(<span class="string">&quot;GOGC&quot;</span>), <span class="string">&quot;GOMAXPROCS&quot;</span>, os.Getenv(<span class="string">&quot;GOMAXPROCS&quot;</span>), <span class="string">&quot;GOTRACEBACK&quot;</span>, os.Getenv(<span class="string">&quot;GOTRACEBACK&quot;</span>))</span><br><span class="line"></span><br><span class="line">    <span class="comment">// TODO(vmarmol): Use container config for this.</span></span><br><span class="line">    <span class="keyword">var</span> oomAdjuster *oom.OOMAdjuster</span><br><span class="line">    <span class="keyword">if</span> s.Config.OOMScoreAdj != <span class="literal">nil</span> &#123;</span><br><span class="line">        oomAdjuster = oom.NewOOMAdjuster()</span><br><span class="line">        <span class="keyword">if</span> err := oomAdjuster.ApplyOOMScoreAdj(<span class="number">0</span>, <span class="type">int</span>(*s.Config.OOMScoreAdj)); err != <span class="literal">nil</span> &#123;</span><br><span class="line">            klog.V(<span class="number">2</span>).InfoS(<span class="string">&quot;Failed to apply OOMScore&quot;</span>, <span class="string">&quot;err&quot;</span>, err)</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br></pre></td></tr></table></figure><p>你有没有好奇 <code>oomAdjuster</code> 是什么东西，它是做什么用的？</p><p>首先你需要承认 <code>kube-proxy</code>，是一个非常重要的组件，因为如果网络不通的话，那么其他服务都是“白给”。所以你需要保证它的正常稳定运行，而我们常见的一个机制 OOM(out of memory) 当内存不够的时候是有概率去 Kill 你的应用的，而 <code>OOMAdjuster</code> 目的就是去调整分数，让进程的优先级提高，让自己变得重要，从而让 OOM 机制先 Kill 别人。</p><p>算是一个小的科技，可以记下来，说不定什么时候也能用上。</p><h4 id="birthCry"><a href="#birthCry" class="headerlink" title="birthCry"></a>birthCry</h4><p>还有一个非常形象的方法命名是 <code>birthCry</code></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// Birth Cry after the birth is successful</span></span><br><span class="line">s.birthCry()</span><br><span class="line"></span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(s *ProxyServer)</span></span> birthCry() &#123;</span><br><span class="line">    s.Recorder.Eventf(s.NodeRef, <span class="literal">nil</span>, api.EventTypeNormal, <span class="string">&quot;Starting&quot;</span>, <span class="string">&quot;StartKubeProxy&quot;</span>, <span class="string">&quot;&quot;</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>就好像刚出生的孩子的第一声啼哭一样，表示整个系统成功启动了。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>2024 读书总结</title>
    <link href="https://www.linkinstars.com/post/191c5d53.html"/>
    <id>https://www.linkinstars.com/post/191c5d53.html</id>
    <published>2024-12-30T16:00:00.000Z</published>
    <updated>2024-12-31T09:58:10.641Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/191c5d53.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>2024 年度读书总结</p></blockquote><h2 id="读书列表"><a href="#读书列表" class="headerlink" title="读书列表"></a>读书列表</h2><blockquote><p>仅罗列，顺序是随机。怕链接会过期，失效的建议直接搜书名。</p></blockquote><p>读了 <a href="https://book.douban.com/subject/34902721/">《如何阅读一本书》</a> ，这次简单按里面分了两个类别，小说和非小说：</p><h3 id="非小说"><a href="#非小说" class="headerlink" title="非小说"></a>非小说</h3><ul><li><a href="https://book.douban.com/subject/36667173/">《深入理解 Go 并发编程：从原理到实践，看这本就够了》</a></li><li><a href="https://book.douban.com/subject/6021440/">《黑客与画家》</a></li><li><a href="https://book.douban.com/subject/36715297/">《100 个 Go 语言典型错误》</a></li><li><a href="https://book.douban.com/subject/35876121/">《纳瓦尔宝典》</a></li><li><a href="https://book.douban.com/subject/30329536/">《数据密集型应用系统设计》</a></li><li><a href="https://colobu.com/gotips/preface.html">《Go 语言编程技巧》</a></li><li><a href="https://book.douban.com/subject/30155731/">《算法之美》</a></li><li><a href="https://book.douban.com/subject/35135787/">《重来 3 跳出疯狂的忙碌》</a></li><li><a href="https://book.douban.com/subject/5914587/">《启示录：打造用户喜爱的产品 》</a></li><li><a href="https://book.douban.com/subject/35339729/">《微信背后的产品观》</a></li><li><a href="https://book.douban.com/subject/36518892/">《埃隆马斯克传》</a></li><li><a href="https://book.douban.com/subject/36804982/">《架构思维》</a></li><li><a href="https://book.douban.com/subject/36864478/">《亿级流量系统架构设计与实战》</a></li><li><a href="https://book.douban.com/subject/35231266/">《MySQL 是怎样运行的》</a></li><li><a href="https://book.douban.com/subject/36830213/">《containerd 原理剖析和实战》</a></li><li><a href="https://book.douban.com/subject/35006892/">《程序员修炼之道》</a></li><li><a href="https://book.douban.com/subject/36819053/">《构建底层逻辑》</a></li><li><a href="https://book.douban.com/subject/36807496/">《码农翻身 2》</a></li><li><a href="https://book.douban.com/subject/26612471/">《搞定 Ⅰ》</a></li><li><a href="https://book.douban.com/subject/36787667/">《Google 的軟體工程之道》</a></li></ul><h3 id="小说"><a href="#小说" class="headerlink" title="小说"></a>小说</h3><ul><li><a href="https://book.douban.com/subject/25955474/">《坏小孩》</a></li><li><a href="https://book.douban.com/subject/30176554/">《暗黑者四部曲》</a></li><li><a href="https://book.douban.com/subject/25799686/">《无证之罪》</a></li><li><a href="https://book.douban.com/subject/26923390/">《长夜难明》</a></li><li><a href="https://book.douban.com/subject/35016085/">《诡计博物馆》</a></li><li><a href="https://book.douban.com/subject/35578935/">《非常疑犯》</a></li><li><a href="https://book.douban.com/subject/36540441/">《宋慈洗冤笔记》</a></li></ul><h2 id="2024-最佳"><a href="#2024-最佳" class="headerlink" title="2024 最佳"></a>2024 最佳</h2><blockquote><p>并不是说其他的书不好，而是它对我的影响最大，印象最深刻</p></blockquote><p><mark style="background: #FFB86CA6;">《埃隆马斯克传》</mark></p><p>今年挺纠结的，想了想还是给它，其实其中对我产生影响的就是几个思维模式的改变，其他有的没的内容也都是听听就好，毕竟成功无法复制，而八卦也只是传记的一部分。当然我还是非常推荐读的，特别是对于工程师来说。其中的五步工作法直接被我简化为：<strong>强行删除、优化迭代、自动化</strong>。放入到实际工作中，我现在会经常问自己一句话，这部分删掉行不行？</p><h2 id="打卡日历"><a href="#打卡日历" class="headerlink" title="打卡日历"></a>打卡日历</h2><p><img src="https://blog.linkinstars.com/blog/reading-2024-DailyCards.png" alt="LinkinStar&#39;s DailyCards"></p><p>看到打卡日历简单回顾了下 2024 读书和摸鱼的日子，除了放假的时间，年中部分的周末都在摸鱼看小说。今年还有两本比较重(要)的书买了还来不及啃，放到明年去了，可以好好期待一下，都是非常不错的技术书。</p><p>今年其实有重新读一些书，比如熟知的 <code>DDIA</code>，渐渐发现了有的书在不同的时间读就是会有不同的收获。然后也有读一些 AI 的书，但要么就是深奥至极，完全无用，要么就是知识普及，随着发展立刻就失效(感叹 AI 迭代的速度)，最后都不如直接用工具来的实在。</p><h2 id="新年新气象"><a href="#新年新气象" class="headerlink" title="新年新气象"></a>新年新气象</h2><p>今年还发现一下质量很不错的播客，回头整理一下明年可以一起打卡。依旧还是和以前一样，现在已经没了新年计划的习惯，新的一年准备让自己换换脑子，改变一下。且将新火试新茶，诗酒趁年华。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;2024 年度读书总结&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;读书列表&quot;&gt;&lt;a href=&quot;#读书列表&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="reading" scheme="https://www.linkinstars.com/categories/reading/"/>
    
    
    <category term="reading" scheme="https://www.linkinstars.com/tags/reading/"/>
    
  </entry>
  
  <entry>
    <title>k8s pod 之间是如何通过 DNS 访问的</title>
    <link href="https://www.linkinstars.com/post/9f1fcba7.html"/>
    <id>https://www.linkinstars.com/post/9f1fcba7.html</id>
    <published>2024-12-19T16:00:00.000Z</published>
    <updated>2024-12-26T10:05:43.973Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/9f1fcba7.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<p>最近一直没有更新 k8s 源码的部分是因为看到这部分的时候发现对于 service 的一些细节还并不是非常的清晰，所以从原理的角度上来看有的地方不是特别好解释清楚。所以需要先补充一些小细节以帮助消化。<del>嗯，这是一个非常让人同情的拖更理由了。</del></p><p>我之前有写过 <a href="https://www.linkinstars.com/post/43a15dd1.html">K8S 之跨主机通信</a> 是深入到了 ip 包内部的路由，但是回过头发现好像没有记录从应用层的流转是如何做的。也就是说，在 k8s 内部，pod1 访问 pod2 可以如何进行访问呢？这里面的链路是怎么走的呢？其实非常简单，让我们今天一起来看看。</p><h2 id="如何访问"><a href="#如何访问" class="headerlink" title="如何访问"></a>如何访问</h2><p>首先，k8s 官方文档中给出了访问的方式。</p><h3 id="相同的-namespace-下"><a href="#相同的-namespace-下" class="headerlink" title="相同的 namespace 下"></a>相同的 namespace 下</h3><p>在相同的 namespace 下，我们可以很容易的通过对应 pod 的 service 的 name 来访问，比如 pod1 的 service 的 name 是 pod1-service。我们就可以通过</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ curl http://pod1-service:port/</span><br></pre></td></tr></table></figure><p>来访问，其中端口就是 service 申明的 port</p><h3 id="不同的-namespace-下"><a href="#不同的-namespace-下" class="headerlink" title="不同的 namespace 下"></a>不同的 namespace 下</h3><p>而如果两个 pod 在不同的 namespace 下，也简单，只需要加上 namespace 的名称就可以了。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ curl http://pod1-service.pod1-namespace:port/</span><br></pre></td></tr></table></figure><p>其中 pod1-namespace 就是 pod1 所在的 namespace</p><h2 id="为什么可以访问？"><a href="#为什么可以访问？" class="headerlink" title="为什么可以访问？"></a>为什么可以访问？</h2><p>那么为什么通过一个简单的名字就可以访问了呢？</p><h3 id="寻找路由"><a href="#寻找路由" class="headerlink" title="寻找路由"></a>寻找路由</h3><p>首先我们来看看路由，对于我们访问的整个 url 地址，那么端口前面的一定是域名了，也就是说在 k8s 内部可以对这样的域名做解析。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ ping pod-1</span><br><span class="line">PING pod-1 (10.105.44.235): 56 data bytes</span><br><span class="line">64 bytes from 10.105.44.235: <span class="built_in">seq</span>=0 ttl=64 <span class="keyword">time</span>=0.187 ms</span><br><span class="line">64 bytes from 10.105.44.235: <span class="built_in">seq</span>=1 ttl=64 <span class="keyword">time</span>=0.056 ms</span><br></pre></td></tr></table></figure><p>可以看到解析到了对应 pod 的 ip。<code>ping</code> 还不够清楚，我们用 <code>traceroute</code> 来看看。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">$ traceroute pod-1</span><br><span class="line">traceroute to pod-1 (10.105.44.235), 30 hops max, 46 byte packets</span><br><span class="line"> 1  pod-1.pod1-namepsace.svc.cluster.local (10.105.44.235)  0.018 ms  0.024 ms  0.014 ms</span><br></pre></td></tr></table></figure><p>可以非常清楚的看到，先被解析到了 <code>pod-1.pod1-namepsace.svc.cluster.local</code> 这样一个地址。你这个地址是哪里来的呢？那就要问问 DNS 了。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">cat</span> /etc/resolv.conf</span><br><span class="line">nameserver 10.96.0.10</span><br><span class="line">search pod1-namepsace.svc.cluster.local svc.cluster.local cluster.local</span><br><span class="line">options ndots:5</span><br></pre></td></tr></table></figure><p>可以看到我们的域名服务 <code>nameserver 10.96.0.10</code> 并且会尝试搜索下面 <code>search</code> 的名字，举例来说：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line">$ nslookup pod-1</span><br><span class="line">Server:         10.96.0.10</span><br><span class="line">Address:        10.96.0.10:53</span><br><span class="line"></span><br><span class="line">** server can<span class="string">&#x27;t find pod-1.cluster.local: NXDOMAIN</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">** server can&#x27;</span>t find pod-1.cluster.local: NXDOMAIN</span><br><span class="line"></span><br><span class="line">** server can<span class="string">&#x27;t find pod-1.svc.cluster.local: NXDOMAIN</span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string"></span></span><br><span class="line"><span class="string">** server can&#x27;</span>t find pod-1.svc.cluster.local: NXDOMAIN</span><br><span class="line"></span><br><span class="line">Name:   pod-1.pod1-namespace.svc.cluster.local</span><br><span class="line">Address: 10.105.44.235</span><br></pre></td></tr></table></figure><p>关键来了 <code>10.96.0.10</code> 是个什么服务呢？</p><h2 id="CoreDNS"><a href="#CoreDNS" class="headerlink" title="CoreDNS"></a>CoreDNS</h2><p>然后我们就可以去 kube-system 的 namespace 下找找这样的 service ，于是乎你就会发现有一个 <code>kube-dns</code> 的 service 它的 clusterIP 就是 <code>10.96.0.10</code> 我们在 resolv.conf 里面看到的这个。那么它对应的 pod 服务也就是我们的 <code>coreDNS</code> 了。通过</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">labels:</span></span><br><span class="line">    <span class="attr">k8s-app:</span> <span class="string">kube-dns</span></span><br></pre></td></tr></table></figure><p>关联上了。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>其实总的链路并不复杂：</p><ol><li>域名</li><li>查看 <code>/etc/resolv.conf</code> 找到了 DNS 服务器</li><li>访问服务器也就是 CoreDNS 服务</li><li>服务返回了对应 DNS 的 ip 记录</li><li>最终通过 ip 访问到服务</li></ol><p>这就是 k8s pod 之间是通过 DNS 访问的链路了。理解了它，无论是使用还是排查相信都能帮助到你。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;p&gt;最近一直没有更新 k8s 源码的部分是因为看到这部分的时候发现对于 service</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="coreDNS" scheme="https://www.linkinstars.com/tags/coreDNS/"/>
    
  </entry>
  
  <entry>
    <title>推荐最近几个 Mac 上的小应用</title>
    <link href="https://www.linkinstars.com/post/97ae1131.html"/>
    <id>https://www.linkinstars.com/post/97ae1131.html</id>
    <published>2024-12-09T16:00:00.000Z</published>
    <updated>2025-01-13T08:42:54.715Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/97ae1131.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<p>最近手边有几个用着不错的小应用可以拿出来分享下</p><h2 id="应用"><a href="#应用" class="headerlink" title="应用"></a>应用</h2><ul><li><a href="https://inputsource.pro/zh-CN">Input Source PRO</a> 在切换输入法时显示当前输入法</li><li><a href="https://www.homerow.app/">homerow</a> 使用键盘全操作桌面</li><li><a href="https://github.com/trzsz/trzsz-ssh">tssh</a> 方法管理 ssh 设备，这个不只是 mac 可以用</li><li><a href="https://powerkeys.github.io/">Power Keys</a> 增强的快捷键，在 mac 上需要借助 Karabiner-Elements 实现，由于我原来就用所以非常方便</li></ul><h2 id="Input-Source-PRO"><a href="#Input-Source-PRO" class="headerlink" title="Input Source PRO"></a>Input Source PRO</h2><ul><li><a href="https://inputsource.pro/zh-CN">https://inputsource.pro/zh-CN</a></li></ul><p>如果你经常有中英文同时输入的场景，你一定会遇到这个问题，就是在进入一个软件，特别是全屏的 IDE，你不知道你用的是什么输入法，一看输入的是英文，就去切换，结果还是英文，再切换还是，然后发现刚才其实是中文输入法需要按一下 shift 就可以了…. 这个软件可以解决这个问题，在进入一个应用的使用告诉你当前的输入法是什么，并且可以提示不同颜色，还能根据应用切换输入法、自定义快捷键等等。总之是解决了我切换输入法的问题。</p><h2 id="homerow"><a href="#homerow" class="headerlink" title="homerow"></a>homerow</h2><p><a href="https://www.homerow.app/">https://www.homerow.app/</a></p><p>之前我一直用的是 <a href="https://github.com/nchudleigh/vimac/">https://github.com/nchudleigh/vimac/</a> 不过操作过程中发现有时候会有问题，无法点击到一些可以点击的位置，而这个 homerow 很不错。对于键盘使用更频繁的我来说非常不错，在非必要的时候几乎可以不用鼠标了。并且我很喜欢他的滚动，避免原来滚动时被迫需要鼠标的尴尬。非常不错。而且不付费也仅仅是偶尔会使用过程中有提醒而已，和付费几乎没区别，我可以接受。</p><h2 id="tssh"><a href="#tssh" class="headerlink" title="tssh"></a>tssh</h2><p><a href="https://github.com/trzsz/trzsz-ssh">https://github.com/trzsz/trzsz-ssh</a></p><p>这个其实不用多说，用了确实不错，云厂商的服务器最近手边有点多，用起来挺方便的。</p><p><img src="https://camo.githubusercontent.com/0a0114ff30ff6057defd516aa85681fa5ea59e2f3923969a5768305dee43d78f/68747470733a2f2f74727a737a2e6769746875622e696f2f696d616765732f747373685f74696e792e676966" alt="tssh-img"></p><h2 id="Power-Keys"><a href="#Power-Keys" class="headerlink" title="Power Keys"></a>Power Keys</h2><p><a href="https://powerkeys.github.io/">https://powerkeys.github.io/</a></p><p>官方的介绍非常简洁，我就不多说了，它可以实现一些快捷键的增强。我非常喜欢其中的 F1 文件夹的实现，特别方便，除了平常的启动快捷键之外，其实可以非常方便的将常用的 app 放进需要的文件夹中。还有一个是空格的快速编辑，我还没特别熟悉，但也是非常方便，然后其他功能比如数字输入什么的我都关掉了，和我现在的使用有冲突。总之，你可以试试看看，不喜欢也可以直接移除，非常方便。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;p&gt;最近手边有几个用着不错的小应用可以拿出来分享下&lt;/p&gt;
&lt;h2 id=&quot;应用&quot;&gt;&lt;a href=&quot;#应用&quot; class=&quot;headerlink&quot; title=&quot;应用&quot;&gt;&lt;/a&gt;应用&lt;/h2&gt;&lt;ul&gt;
&lt;li&gt;&lt;a</summary>
        
      
    
    
    
    <category term="macos-hint" scheme="https://www.linkinstars.com/categories/macos-hint/"/>
    
    
    <category term="app" scheme="https://www.linkinstars.com/tags/app/"/>
    
  </entry>
  
  <entry>
    <title>Golang 捕获意外丢失的 Panic 日志</title>
    <link href="https://www.linkinstars.com/post/3a89be09.html"/>
    <id>https://www.linkinstars.com/post/3a89be09.html</id>
    <published>2024-11-30T16:00:00.000Z</published>
    <updated>2024-12-26T11:57:06.817Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/3a89be09.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>你是否曾经遇到过 Golang 应用意外重启但没有任何错误日志停留，怀疑出现 <code>panic</code> 但是苦于没有证据？那么今天就来解决一下可能出现的一个问题，捕获哪些可能被我遗漏的 panic 日志。</p><h2 id="原因"><a href="#原因" class="headerlink" title="原因"></a>原因</h2><p>首先我们要清楚一下出现这样情况的可能原因是什么。(之所以我说可能的原因，是因为不是所有没有日志的情况都是这个原因) 。对于日志的处理方式，通常我们会使用一些日志框架帮助我们进行处理，特别是对于文件，将日志打印到文件中的情况。<strong>而当一个 panic 出现的时候，如果我们没有捕获它，它的错误信息只会被打印到 <code>stderr</code> 中去，也就是标准错误输出。</strong></p><p>所以，对于一些容器化部署的项目来说，特别是部署到 k8s 上的时候，会统一使用 std 作为日志捕获和处理的手段，无论是采用 sidecar 还是其他方式采集，将 std 的输出直接采集发送到日志系统，所有日志系统进行归档统一处理就一般没啥问题。而对于一些仅使用二进制或者是 docker 部署的情况，又或是没有采用日志采集 std 的时候就可能会出现了。</p><h2 id="实验"><a href="#实验" class="headerlink" title="实验"></a>实验</h2><p>这里我使用 zap 作为日志组件引入，如果你不熟悉可以直接忽略这部分的代码。然后快速实验一下，对于日志文件是否有 panic 记录。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;fmt&quot;</span></span><br><span class="line">    rotatelogs <span class="string">&quot;github.com/lestrrat-go/file-rotatelogs&quot;</span></span><br><span class="line">    <span class="string">&quot;go.uber.org/zap&quot;</span></span><br><span class="line">    <span class="string">&quot;go.uber.org/zap/zapcore&quot;</span></span><br><span class="line">    <span class="string">&quot;os&quot;</span></span><br><span class="line">    <span class="string">&quot;runtime&quot;</span></span><br><span class="line">    <span class="string">&quot;syscall&quot;</span></span><br><span class="line">    <span class="string">&quot;time&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    logger := initLogger()</span><br><span class="line">    logger.Debug(<span class="string">&quot;debug&quot;</span>)</span><br><span class="line">    logger.Info(<span class="string">&quot;info&quot;</span>)</span><br><span class="line">    logger.Warn(<span class="string">&quot;warn&quot;</span>)</span><br><span class="line">    logger.Error(<span class="string">&quot;error&quot;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;exit&quot;</span>)</span><br><span class="line">    &#125;()</span><br><span class="line"></span><br><span class="line">    time.Sleep(time.Second * <span class="number">3</span>)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">initLogger</span><span class="params">()</span></span> *zap.Logger &#123;</span><br><span class="line">    errWriter, err := rotatelogs.New(</span><br><span class="line">        <span class="string">&quot;linkinstar_err_%Y-%m-%d.log&quot;</span>,</span><br><span class="line">    )</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="built_in">panic</span>(err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    consoleDebugging := zapcore.Lock(os.Stdout)</span><br><span class="line">    consoleEncoderConfig := zap.NewDevelopmentEncoderConfig()</span><br><span class="line">    consoleEncoderConfig.EncodeTime = zapcore.ISO8601TimeEncoder</span><br><span class="line">    consoleEncoderConfig.EncodeLevel = zapcore.CapitalColorLevelEncoder</span><br><span class="line">    consoleEncoder := zapcore.NewConsoleEncoder(consoleEncoderConfig)</span><br><span class="line">    consoleCore := zapcore.NewCore(consoleEncoder, consoleDebugging, zapcore.DebugLevel)</span><br><span class="line"></span><br><span class="line">    errorCore := zapcore.AddSync(errWriter)</span><br><span class="line">    fileEncodeConfig := zap.NewProductionEncoderConfig()</span><br><span class="line">    fileEncodeConfig.EncodeTime = zapcore.ISO8601TimeEncoder</span><br><span class="line">    fileEncoder := zapcore.NewConsoleEncoder(fileEncodeConfig)</span><br><span class="line">    lowPriority := zap.LevelEnablerFunc(<span class="function"><span class="keyword">func</span><span class="params">(lvl zapcore.Level)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> lvl &gt;= zapcore.DebugLevel</span><br><span class="line">    &#125;)</span><br><span class="line"></span><br><span class="line">    core := zapcore.NewTee(consoleCore, zapcore.NewCore(fileEncoder, errorCore, lowPriority))</span><br><span class="line">    caller := zap.AddCaller()</span><br><span class="line">    logger := zap.New(core, caller, zap.Development())</span><br><span class="line">    zap.ReplaceGlobals(logger)</span><br><span class="line">    <span class="keyword">return</span> logger</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>可以看到日志文件里面输出的是：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">2024-12-10T15:56:15.505+0800debugmain.go:15debug</span><br><span class="line">2024-12-10T15:56:15.506+0800infomain.go:16info</span><br><span class="line">2024-12-10T15:56:15.506+0800warnmain.go:17warn</span><br><span class="line">2024-12-10T15:56:15.506+0800errormain.go:18error</span><br></pre></td></tr></table></figure><p>并没有 panic 日志。而在实际项目中，如果有第三方的依赖意外导致了 panic 那么就很有可能是这样的情况，容器重启，但没有任何现场日志保留。</p><h2 id="解决方案"><a href="#解决方案" class="headerlink" title="解决方案"></a>解决方案</h2><p>说起来很简单，也是网上来的方案，非常行之有效，<strong>就是将 stderr 重写到你需要的文件里面</strong>。即下面的 <code>RewriteStderrFile</code></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br><span class="line">75</span><br><span class="line">76</span><br><span class="line">77</span><br><span class="line">78</span><br><span class="line">79</span><br><span class="line">80</span><br><span class="line">81</span><br><span class="line">82</span><br><span class="line">83</span><br><span class="line">84</span><br><span class="line">85</span><br><span class="line">86</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;fmt&quot;</span></span><br><span class="line">    rotatelogs <span class="string">&quot;github.com/lestrrat-go/file-rotatelogs&quot;</span></span><br><span class="line">    <span class="string">&quot;go.uber.org/zap&quot;</span></span><br><span class="line">    <span class="string">&quot;go.uber.org/zap/zapcore&quot;</span></span><br><span class="line">    <span class="string">&quot;log&quot;</span></span><br><span class="line">    <span class="string">&quot;os&quot;</span></span><br><span class="line">    <span class="string">&quot;runtime&quot;</span></span><br><span class="line">    <span class="string">&quot;syscall&quot;</span></span><br><span class="line">    <span class="string">&quot;time&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    RewriteStderrFile(<span class="string">&quot;linkinstar_panic.log&quot;</span>)</span><br><span class="line"></span><br><span class="line">    logger := initLogger()</span><br><span class="line">    logger.Debug(<span class="string">&quot;debug&quot;</span>)</span><br><span class="line">    logger.Info(<span class="string">&quot;info&quot;</span>)</span><br><span class="line">    logger.Warn(<span class="string">&quot;warn&quot;</span>)</span><br><span class="line">    logger.Error(<span class="string">&quot;error&quot;</span>)</span><br><span class="line"></span><br><span class="line">    log.Println(<span class="string">&quot;print to std&quot;</span>)</span><br><span class="line">    fmt.Fprintf(os.Stderr, <span class="string">&quot;print to stderr\n&quot;</span>)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line">        <span class="built_in">panic</span>(<span class="string">&quot;exit&quot;</span>)</span><br><span class="line">    &#125;()</span><br><span class="line"></span><br><span class="line">    time.Sleep(time.Second * <span class="number">3</span>)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">initLogger</span><span class="params">()</span></span> *zap.Logger &#123;</span><br><span class="line">    errWriter, err := rotatelogs.New(</span><br><span class="line">        <span class="string">&quot;linkinstar_err_%Y-%m-%d.log&quot;</span>,</span><br><span class="line">    )</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="built_in">panic</span>(err)</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    consoleDebugging := zapcore.Lock(os.Stdout)</span><br><span class="line">    consoleEncoderConfig := zap.NewDevelopmentEncoderConfig()</span><br><span class="line">    consoleEncoderConfig.EncodeTime = zapcore.ISO8601TimeEncoder</span><br><span class="line">    consoleEncoderConfig.EncodeLevel = zapcore.CapitalColorLevelEncoder</span><br><span class="line">    consoleEncoder := zapcore.NewConsoleEncoder(consoleEncoderConfig)</span><br><span class="line">    consoleCore := zapcore.NewCore(consoleEncoder, consoleDebugging, zapcore.DebugLevel)</span><br><span class="line"></span><br><span class="line">    errorCore := zapcore.AddSync(errWriter)</span><br><span class="line">    fileEncodeConfig := zap.NewProductionEncoderConfig()</span><br><span class="line">    fileEncodeConfig.EncodeTime = zapcore.ISO8601TimeEncoder</span><br><span class="line">    fileEncoder := zapcore.NewConsoleEncoder(fileEncodeConfig)</span><br><span class="line">    lowPriority := zap.LevelEnablerFunc(<span class="function"><span class="keyword">func</span><span class="params">(lvl zapcore.Level)</span></span> <span class="type">bool</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> lvl &gt;= zapcore.DebugLevel</span><br><span class="line">    &#125;)</span><br><span class="line"></span><br><span class="line">    core := zapcore.NewTee(consoleCore, zapcore.NewCore(fileEncoder, errorCore, lowPriority))</span><br><span class="line">    caller := zap.AddCaller()</span><br><span class="line">    logger := zap.New(core, caller, zap.Development())</span><br><span class="line">    zap.ReplaceGlobals(logger)</span><br><span class="line">    <span class="keyword">return</span> logger</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// 从此处开始为解决方案</span></span><br><span class="line"><span class="keyword">var</span> stdErrFileHandler *os.File</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">RewriteStderrFile</span><span class="params">(panicFilePath <span class="type">string</span>)</span></span> <span class="type">error</span> &#123;</span><br><span class="line">    <span class="keyword">if</span> runtime.GOOS == <span class="string">&quot;windows&quot;</span> &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    file, err := os.OpenFile(panicFilePath, os.O_RDWR|os.O_CREATE|os.O_APPEND, <span class="number">0666</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        fmt.Println(err)</span><br><span class="line">        <span class="keyword">return</span> err</span><br><span class="line">    &#125;</span><br><span class="line">    stdErrFileHandler = file</span><br><span class="line">    <span class="keyword">if</span> err = syscall.Dup2(<span class="type">int</span>(file.Fd()), <span class="type">int</span>(os.Stderr.Fd())); err != <span class="literal">nil</span> &#123;</span><br><span class="line">        fmt.Println(err)</span><br><span class="line">        <span class="keyword">return</span> err</span><br><span class="line">    &#125;</span><br><span class="line">    runtime.SetFinalizer(stdErrFileHandler, <span class="function"><span class="keyword">func</span><span class="params">(fd *os.File)</span></span> &#123;</span><br><span class="line">        fd.Close()</span><br><span class="line">    &#125;)</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>这样我们就会额外得到一个 <code>panic.log</code></p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">2024/12/10 16:01:03 <span class="built_in">print</span> to std</span><br><span class="line"><span class="built_in">print</span> to stderr</span><br><span class="line">panic: <span class="built_in">exit</span></span><br><span class="line"></span><br><span class="line">goroutine 35 [running]:</span><br><span class="line">main.main.func1()</span><br><span class="line">xxx/main.go:28 +0x2c</span><br><span class="line">created by main.main <span class="keyword">in</span> goroutine 1</span><br><span class="line">xxx/main.go:27 +0x138</span><br></pre></td></tr></table></figure><p>方案本质是利用 <code>syscall.Dup2</code> 方法重定向输出。注意 Windows 上没有这个方法，文末链接中有解决方案，我手边暂时没有设备可测。</p><h2 id="其他提醒"><a href="#其他提醒" class="headerlink" title="其他提醒"></a>其他提醒</h2><p>以上的解决方案是一个保底，在一些无法预知情况下帮你兜底。而实际中的最佳实践中你依旧需要保证，在可能出现 panic 的位置提前主动 recover 进行处理。</p><h2 id="参考链接"><a href="#参考链接" class="headerlink" title="参考链接"></a>参考链接</h2><ul><li><a href="https://stackoverflow.com/questions/34772012/capturing-panic-in-golang">https://stackoverflow.com/questions/34772012/capturing-panic-in-golang</a> 这里也提到了 Windows 平台下的解决方案</li><li><a href="https://juejin.cn/post/6872582459205582862">https://juejin.cn/post/6872582459205582862</a></li></ul>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;你是否曾经遇到过 Golang 应用意外重启但没有任何错误日志停留，怀疑出现 &lt;code&gt;panic&lt;/code&gt;</summary>
        
      
    
    
    
    <category term="golang" scheme="https://www.linkinstars.com/categories/golang/"/>
    
    
    <category term="panic" scheme="https://www.linkinstars.com/tags/panic/"/>
    
  </entry>
  
  <entry>
    <title>Golang 下载文件时重命名</title>
    <link href="https://www.linkinstars.com/post/7a16ecab.html"/>
    <id>https://www.linkinstars.com/post/7a16ecab.html</id>
    <published>2024-11-29T16:00:00.000Z</published>
    <updated>2024-12-11T08:48:45.928Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/7a16ecab.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>通常用户上传文件之后会进行重命名(随机)，而下载的时候如果还是使用重命名的话对用户来说保存下来不够友好，所以在下载的时候我们通常会对于名字再次做重命名，方便用户下载之后进行分类和使用。</p><h2 id="前端实现"><a href="#前端实现" class="headerlink" title="前端实现"></a>前端实现</h2><p>当然如果使用 a 标签实现，我们可以非常快速的通过 download 属性来实现，主流的浏览器也支持了这个属性。</p><figure class="highlight html"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="tag">&lt;<span class="name">a</span> <span class="attr">href</span>=<span class="string">&quot;link/to/your/download/file&quot;</span> <span class="attr">download</span>=<span class="string">&quot;filename&quot;</span>&gt;</span>Download link<span class="tag">&lt;/<span class="name">a</span>&gt;</span></span><br></pre></td></tr></table></figure><h2 id="后端实现"><a href="#后端实现" class="headerlink" title="后端实现"></a>后端实现</h2><p>其实通过后端来实现也非常简单，以 gin 框架举例，可以直接使用 <code>FileAttachment</code> 方法。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">r.GET(<span class="string">&quot;/download&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">(c *gin.Context)</span></span> &#123;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">c.FileAttachment(fileLocalPath, originalFilename)</span><br><span class="line">&#125;)</span><br></pre></td></tr></table></figure><p>其中 <code>fileLocalPath</code> 是文件本地目录，而 <code>originalFilename</code> 则是下载时要指定的文件名字。</p><p>当然不使用 <code>FileAttachment</code> 方法也可以，通过指定 <code>Content-Disposition</code> header 实现的</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *Context)</span></span> FileAttachment(filepath, filename <span class="type">string</span>) &#123;</span><br><span class="line"><span class="keyword">if</span> isASCII(filename) &#123;</span><br><span class="line">c.Writer.Header().Set(<span class="string">&quot;Content-Disposition&quot;</span>, <span class="string">`attachment; filename=&quot;`</span>+escapeQuotes(filename)+<span class="string">`&quot;`</span>)</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">c.Writer.Header().Set(<span class="string">&quot;Content-Disposition&quot;</span>, <span class="string">`attachment; filename*=UTF-8&#x27;&#x27;`</span>+url.QueryEscape(filename))</span><br><span class="line">&#125;</span><br><span class="line">http.ServeFile(c.Writer, c.Request, filepath)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>然后指定 <code>attachment; filename*=UTF-8&#39;&#39;</code> + 文件名称就可以了</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="golang" scheme="https://www.linkinstars.com/categories/golang/"/>
    
    
  </entry>
  
  <entry>
    <title>Golang gin 返回 XML 的小坑点</title>
    <link href="https://www.linkinstars.com/post/3bbe0d95.html"/>
    <id>https://www.linkinstars.com/post/3bbe0d95.html</id>
    <published>2024-11-19T16:00:00.000Z</published>
    <updated>2024-12-11T08:48:45.929Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/3bbe0d95.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>现在绝大多数系统已经都使用 json 格式来返回和传输数据。但实际中还是有一些客户需要返回 XML 数据做处理，而在这不服处理过程中最近遇到了两个小问题：</p><ol><li>如何返回 xml 协议头？</li><li>如何返回小写的 <code>&lt;root&gt;</code> 节点？</li></ol><p>如果我们像 JSON 格式一样使用 <code>ctx.XML</code> 方法直接返回数据的话就会遇到这样的问题。所以具体使用场景中没有像 JSON 一样那么容易直接使用 <code>ctx.JSON</code> 就可以完成，还需要做一些额外的处理。</p><h2 id="如何返回协议头"><a href="#如何返回协议头" class="headerlink" title="如何返回协议头"></a>如何返回协议头</h2><p>第一个问题比较简单，就是返回的 xml 通常会有一个 header ，可能像下面这样</p><figure class="highlight xml"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">&lt;?xml version=<span class="string">&quot;1.0&quot;</span> encoding=<span class="string">&quot;UTF-8&quot;</span>?&gt;</span></span><br></pre></td></tr></table></figure><p>但其实默认直接返回是没有的，于是就需要主动去写一次，解决方式也非常简单</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;encoding/xml&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">Controller</span><span class="params">(ctx *gin.Context)</span></span> &#123;</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">    ctx.Writer.Write([]<span class="type">byte</span>(xml.Header))</span><br><span class="line">    <span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>xml 库中默认就呀这个 header，所以直接用就可以了。</p><h2 id="如何返回小写的-节点？"><a href="#如何返回小写的-节点？" class="headerlink" title="如何返回小写的 &lt;root&gt; 节点？"></a>如何返回小写的 <code>&lt;root&gt;</code> 节点？</h2><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;net/http&quot;</span></span><br><span class="line"></span><br><span class="line"></span><br><span class="line">    <span class="string">&quot;github.com/gin-gonic/gin&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">type</span> Resp <span class="keyword">struct</span> &#123;</span><br><span class="line">    Code    <span class="type">string</span>   <span class="string">`xml:&quot;code&quot;`</span></span><br><span class="line">    Data    <span class="type">string</span>   <span class="string">`xml:&quot;data&quot;`</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    r := gin.New()</span><br><span class="line">    r.GET(<span class="string">&quot;/hello&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">(ctx *gin.Context)</span></span> &#123;</span><br><span class="line">    resp := &amp;Resp&#123;</span><br><span class="line">    Data: <span class="string">&quot;aaa&quot;</span>,</span><br><span class="line">    Code: <span class="string">&quot;bbb&quot;</span>,</span><br><span class="line">    &#125;</span><br><span class="line">    ctx.XML(http.StatusOK, resp)</span><br><span class="line">    &#125;)</span><br><span class="line"></span><br><span class="line">    err := r.Run(<span class="string">&quot;:8080&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">    <span class="built_in">panic</span>(err)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>如果按照上述代码编写返回的时候数据就是</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">curl http://127.0.0.1:8080/hello</span><br><span class="line">&lt;Resp&gt;&lt;code&gt;bbb&lt;/code&gt;&lt;data&gt;aaa&lt;/data&gt;&lt;/Resp&gt;</span><br></pre></td></tr></table></figure><p>可以看到最外层返回的节点是 struct 的名称，而通常这不是我们想要的，我们需要指定最外层的 root 节点的名称。</p><p>经过踩坑，发现在 golang 中，可以在通过 <code>xml.Name</code> 来实现。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line">    <span class="string">&quot;net/http&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="string">&quot;encoding/xml&quot;</span></span><br><span class="line"></span><br><span class="line">    <span class="string">&quot;github.com/gin-gonic/gin&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="keyword">type</span> Resp <span class="keyword">struct</span> &#123;</span><br><span class="line">    XMLName xml.Name <span class="string">`xml:&quot;root&quot;`</span></span><br><span class="line">    Code    <span class="type">string</span>   <span class="string">`xml:&quot;code&quot;`</span></span><br><span class="line">    Data    <span class="type">string</span>   <span class="string">`xml:&quot;data&quot;`</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">    r := gin.New()</span><br><span class="line">    r.GET(<span class="string">&quot;/hello&quot;</span>, <span class="function"><span class="keyword">func</span><span class="params">(ctx *gin.Context)</span></span> &#123;</span><br><span class="line">        resp := &amp;Resp&#123;</span><br><span class="line">            Data: <span class="string">&quot;aaa&quot;</span>,</span><br><span class="line">            Code: <span class="string">&quot;bbb&quot;</span>,</span><br><span class="line">        &#125;</span><br><span class="line">        ctx.XML(http.StatusOK, resp)</span><br><span class="line">    &#125;)</span><br><span class="line"></span><br><span class="line">    err := r.Run(<span class="string">&quot;:8080&quot;</span>)</span><br><span class="line">    <span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">        <span class="built_in">panic</span>(err)</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>加上之后，返回的内容就会变成</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">curl http://127.0.0.1:8080/hello</span><br><span class="line">&lt;root&gt;&lt;code&gt;bbb&lt;/code&gt;&lt;data&gt;aaa&lt;/data&gt;&lt;/root&gt;</span><br></pre></td></tr></table></figure>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;现在绝大多数系统已经都使用 json 格式来返回和传输数据。但实际中还是有一些客户需要返回 XML</summary>
        
      
    
    
    
    <category term="golang" scheme="https://www.linkinstars.com/categories/golang/"/>
    
    
    <category term="xml" scheme="https://www.linkinstars.com/tags/xml/"/>
    
  </entry>
  
  <entry>
    <title>Golang 快速接入两步验证 2FA</title>
    <link href="https://www.linkinstars.com/post/e68cb5e1.html"/>
    <id>https://www.linkinstars.com/post/e68cb5e1.html</id>
    <published>2024-11-09T16:00:00.000Z</published>
    <updated>2024-12-11T08:48:45.929Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/e68cb5e1.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>自从 GitHub 开启强制 2FA 之后，如今很多应用紧跟着接入了这个功能，两步验证，是当用户在输入密码之后，还需要输入一个一次性的验证码来进行额外的第二次验证。相比于国外环境，国内更喜欢也更普遍的是短信验证码，动不动就发一条短信。而 2FA 的强大在于是可以离线完成的，那么如何接入这个功能呢？其实了解过程和原理之后，实现是非常简单的。</p><h2 id="快速体验"><a href="#快速体验" class="headerlink" title="快速体验"></a>快速体验</h2><p>先二话不说，先体验一下最终实现的效果，利用 <code>github.com/xlzd/gotp</code> 我们可以非常容易的实现整个过程。</p><h3 id="准备一个-APP"><a href="#准备一个-APP" class="headerlink" title="准备一个 APP"></a>准备一个 APP</h3><p>首先你需要准备一个 2FA 的 APP，如：Google Authenticator, 1Password, Authy, Microsoft Authenticator 都可以</p><h3 id="运行并验证"><a href="#运行并验证" class="headerlink" title="运行并验证"></a>运行并验证</h3><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;bufio&quot;</span></span><br><span class="line"><span class="string">&quot;fmt&quot;</span></span><br><span class="line"><span class="string">&quot;os&quot;</span></span><br><span class="line"><span class="string">&quot;time&quot;</span></span><br><span class="line"></span><br><span class="line"><span class="string">&quot;github.com/skip2/go-qrcode&quot;</span></span><br><span class="line"><span class="string">&quot;github.com/xlzd/gotp&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line"><span class="comment">// 生成一个随机的密钥</span></span><br><span class="line">randomSecret := gotp.RandomSecret(<span class="number">16</span>)</span><br><span class="line">randomSecret = <span class="string">&quot;ENRVL5I4WPXURPIJRC7XZAI7U4&quot;</span> <span class="comment">// 此处为了测试方便固定密钥</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// 生成二维码，此处需要提供用户名称和机构名称</span></span><br><span class="line">uri := gotp.NewDefaultTOTP(randomSecret).ProvisioningUri(<span class="string">&quot;user@linkinstars.com&quot;</span>, <span class="string">&quot;LinkinStar&quot;</span>)</span><br><span class="line">fmt.Println(uri)</span><br><span class="line">qrcode.WriteFile(uri, qrcode.Medium, <span class="number">256</span>, <span class="string">&quot;qr.png&quot;</span>)</span><br><span class="line"></span><br><span class="line">fmt.Print(<span class="string">&quot;扫描二维码后，请输入 APP 上的验证码：&quot;</span>)</span><br><span class="line">scanner := bufio.NewScanner(os.Stdin)</span><br><span class="line">scanner.Scan()</span><br><span class="line">userInput := scanner.Text()</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> gotp.NewDefaultTOTP(randomSecret).Verify(userInput, time.Now().Unix()) &#123;</span><br><span class="line">fmt.Println(<span class="string">&quot;验证成功&quot;</span>)</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line">fmt.Println(<span class="string">&quot;验证失败&quot;</span>)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>运行上面的代码后会生成一个 qr.png 的二维码图片，用 APP 扫码后，输入验证码，验证通过即成功 ✌️</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">otpauth://totp/LinkinStar:user@linkinstars.com?issuer=LinkinStar&amp;secret=ENRVL5I4WPXURPIJRC7XZAI7U4</span><br><span class="line">扫描二维码后，请输入 APP 上的验证码：221456</span><br><span class="line">验证成功</span><br></pre></td></tr></table></figure><h2 id="实际使用"><a href="#实际使用" class="headerlink" title="实际使用"></a>实际使用</h2><h3 id="准备参数"><a href="#准备参数" class="headerlink" title="准备参数"></a>准备参数</h3><ul><li>用户的唯一标识 email，通常可以是邮箱</li><li>机构名称 appName，也就是分发这个 2FA 的应用厂商，通常是你应用的名称</li><li>一个随机密钥 secret，可以使用 <code>gotp.RandomSecret(16)</code> 生成，每个用户一个，<strong>注意存储</strong></li></ul><h3 id="过程"><a href="#过程" class="headerlink" title="过程"></a>过程</h3><p>用户初次绑定，使用上述三个参数生成一个 URI。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">uri := gotp.NewDefaultTOTP(secret).ProvisioningUri(email, appName)</span><br></pre></td></tr></table></figure><p>内容大致为：<code>otpauth://totp/LinkinStar:user@linkinstars.com?issuer=LinkinStar&amp;secret=ENRVL5I4WPXURPIJRC7XZAI7U4</code></p><p>然后用这个 URI 生成一个二维码，供用户扫码绑定并输入验证码进行验证。</p><blockquote><p>注意，此处的二维码仅仅展示一次，一旦验证通过，如不必要不进行展示，再次展示也需要做验证</p></blockquote><p>最后验证是否正确即可</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">gotp.NewDefaultTOTP(randomSecret).Verify(userInput, time.Now().Unix())</span><br></pre></td></tr></table></figure><p>之后用户每次登录，需要验证 2FA 时，仅仅输入验证码，然后进行验证即可</p><h2 id="简述原理"><a href="#简述原理" class="headerlink" title="简述原理"></a>简述原理</h2><p>其本质是利用了 TOTP，全称为”基于时间的一次性密码”（Time-based One-time Password），已被纳入国际标准 RFC6238 中。</p><ol><li>用户开启双因素认证后，服务器生成一个密钥。</li><li>服务器要求用户扫描二维码或以其他方式将密钥保存到用户的手机，<strong>确保服务器和用户手机拥有相同的密钥</strong>。</li><li>用户登录时，手机客户端基于该密钥和当前时间戳生成一个哈希，<strong>该哈希在 30 秒内有效</strong>。用户需在有效期内将哈希提交给服务器。注意，<strong>密钥与用户手机绑定，更换手机时需要生成新密钥</strong>。</li><li>服务器也使用密钥和<strong>当前时间戳生成一个哈希</strong>，与用户提交的哈希进行比对。只有在两者一致时，用户才能成功登录。</li></ol><p>RFC6238 规定了以下实现细节：</p><ul><li>生成任意字节的密钥 K，并与客户端安全地共享。</li><li>基于 T0 协商后，Unix 时间从时间间隔（TI）开始计算时间步骤，TI 用于计算计数器 C（默认情况下，TI 的值为 T0 和 30 秒）。</li><li>协商加密哈希算法（默认为 SHA-1）。</li><li>协商密码长度（默认为 6 位）。</li></ul><p>所以原理其实说起来也简单，就是通过时间和密钥做了 hash，并且以 30 秒 为界限。故最重要的一点就是需要保证服务器时间和客户端时间是一致(几乎)的。当然，绝大多数情况下是一致的。</p><h2 id="最后注意点"><a href="#最后注意点" class="headerlink" title="最后注意点"></a>最后注意点</h2><p>最好需要设计一个恢复码，因为开启 2FA 之后没有手机在旁边的时候是无法使用的，万一手机因为意外丢失，则永远无法登录使用了。那么恢复码可以临时让我们使用并进入应用从而避免意外。其主要的作用是为了以防万一和将责任推给用户。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;自从 GitHub 开启强制 2FA</summary>
        
      
    
    
    
    <category term="golang" scheme="https://www.linkinstars.com/categories/golang/"/>
    
    
    <category term="2fa" scheme="https://www.linkinstars.com/tags/2fa/"/>
    
  </entry>
  
  <entry>
    <title>使用 Rust 快速体验 eBPF</title>
    <link href="https://www.linkinstars.com/post/e3eaf4e6.html"/>
    <id>https://www.linkinstars.com/post/e3eaf4e6.html</id>
    <published>2024-10-31T16:00:00.000Z</published>
    <updated>2024-11-08T02:44:09.288Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/e3eaf4e6.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>可能你听过 eBPF 这个听上去非常高大上的技术，但实际并没有使用过。今天让我们来用 rust 快速体验一下 eBPF 的能力。</p><p>之所以使用 Rust 是因为 <a href="https://github.com/aya-rs/aya">aya</a> 框架确实很容易上手，当然如果使用 golang 也有 <a href="https://github.com/cilium/ebpf">https://github.com/cilium/ebpf</a> 的帮助也非常容易的。</p><p>因为只是体验，你可以几乎没有语言本身的使用经验，就能直接运行并感受到 eBPF 的能力。当然整体几乎也不需要写代码。</p><h2 id="环境准备"><a href="#环境准备" class="headerlink" title="环境准备"></a>环境准备</h2><ul><li>开发环境，如果直接在 Linux 环境上开发自然最好，博主使用的是 M 系列的 MacOS 环境，然后交叉编译到 Linux 上运行</li><li>运行环境，当然你需要一个 Linux 环境能运行，并且内核版本要足够，前提就是需要支持 eBPF 和 XDP 才可以</li></ul><h2 id="前置知识点"><a href="#前置知识点" class="headerlink" title="前置知识点"></a>前置知识点</h2><p>由于只是体验，我也不过多深入其中的原理和内容，就简单快速让你了解什么是 eBPF 和 XDP。详细原理可参考各种<a href="https://ebpf.io/zh-hans/what-is-ebpf/">文档</a>。</p><h3 id="eBPF"><a href="#eBPF" class="headerlink" title="eBPF"></a>eBPF</h3><blockquote><p><strong>在不更改内核源代码的情况下，向操作系统添加额外的功能</strong></p></blockquote><p>简单的来说就是通过一些钩子在运行一些调用方法的前后添加一些处理逻辑，以便实现一些功能。有点像 Java 的 AOP，又或者是像一种插件机制，总之我称为是一种动态扩展的能力。强大的地方在于，他是建立在内核上的，也就是说，它获取的都是最原始的信息，拿到了几乎无所不能（因为在系统底层，通常越底层能做的就越多也越复杂）。而还有一个强大的点是无侵入、或者说透明，因为对上层的系统来说是无感知的。</p><p><img src="https://blog.linkinstars.com/blog/ebpf-principle.png" alt="ebpf-principle"></p><p>上图就很清楚的描述了他的作用。</p><h3 id="XDP"><a href="#XDP" class="headerlink" title="XDP"></a>XDP</h3><p>我们知道了 eBPF 其实 XDP 就很容易理解，XDP 就是其中一种钩子，专门用来在网络数据包到达网卡驱动层时对其进行处理。XDP 全称是 eXpress Data Path，XDP 能够在数据包到达内核网络栈之前拦截并处理它们。关键就是快，网络包实在是太多了，所以性能非常的重要。</p><h2 id="安装开发环境依赖"><a href="#安装开发环境依赖" class="headerlink" title="安装开发环境依赖"></a>安装开发环境依赖</h2><p>话不多说，我们赶紧实际上手感受一下吧。首先是我们今天最困难的一个步骤，安装环境。由于环境各种各异，所以遇到的问题也各种各样，对于第一次上手的同学来说这或许就是最困难的一步了。</p><p>文档参考：<a href="https://aya-rs.dev/book/start/development/">https://aya-rs.dev/book/start/development/</a></p><h3 id="安装-rust-环境"><a href="#安装-rust-环境" class="headerlink" title="安装 rust 环境"></a>安装 rust 环境</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ rustup install stable</span><br><span class="line">$ rustup toolchain install nightly --component rust-src</span><br></pre></td></tr></table></figure><h3 id="安装-bpf-linker"><a href="#安装-bpf-linker" class="headerlink" title="安装 bpf-linker"></a>安装 bpf-linker</h3><p>如果是 x86 的 linux 可以直接装</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ cargo install bpf-linker</span><br></pre></td></tr></table></figure><p>由于我是 macOS 环境，使用下面的命令先安装 llvm 再进行安装</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ brew install llvm</span><br><span class="line">$ LLVM_SYS_180_PREFIX=$(brew --prefix llvm) cargo install --no-default-features bpf-linker</span><br></pre></td></tr></table></figure><p>注意此处可能会出现各种问题，由于之前的环境会有影响，总是就是如果有提示报错，不要由于一定是少了什么，直接安装提示安装对应没有的命令和依赖就可以了。</p><h3 id="安装-cargo-generate"><a href="#安装-cargo-generate" class="headerlink" title="安装 cargo-generate"></a>安装 cargo-generate</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ cargo install cargo-generate</span><br></pre></td></tr></table></figure><p>至此基本环境到位，后面还需安装一些交叉编译所需的工具。</p><h2 id="构建第一个-eBPF-程序"><a href="#构建第一个-eBPF-程序" class="headerlink" title="构建第一个 eBPF 程序"></a>构建第一个 eBPF 程序</h2><p>我们使用官方推荐的例子程序直接使用模板进行构建，构建一个 xdp 的 eBPF 程序，用于监听网络包，这也是我们非常用的一个使用场景了</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ cargo generate --name myapp -d program_type=xdp https://github.com/aya-rs/aya-template</span><br></pre></td></tr></table></figure><p>执行这个命令就会按照模板创建以嗯 myapp 的 eBPF 应用程序，然后安装一些交叉编译所需的工具</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">$ brew install filosottile/musl-cross/musl-cross</span><br></pre></td></tr></table></figure><p>然后以目标设备 x86 举例</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">$ <span class="built_in">export</span> ARCH=x86_64</span><br><span class="line">$ rustup target add <span class="variable">$&#123;ARCH&#125;</span>-unknown-linux-musl</span><br></pre></td></tr></table></figure><p>这一部分在生成的 README 中有，如果不清楚可以直接查看。</p><h2 id="编译并运行调试"><a href="#编译并运行调试" class="headerlink" title="编译并运行调试"></a>编译并运行调试</h2><p>最后二话不说，直接编译</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash"><span class="built_in">export</span> ARCH=x86_64</span></span><br><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">CC=<span class="variable">$&#123;ARCH&#125;</span>-linux-musl-gcc cargo build --package myapp --release \</span></span><br><span class="line"><span class="language-bash">  --target=<span class="variable">$&#123;ARCH&#125;</span>-unknown-linux-musl \</span></span><br><span class="line"><span class="language-bash">  --config=target.<span class="variable">$&#123;ARCH&#125;</span>-unknown-linux-musl.linker=\&quot;<span class="variable">$&#123;ARCH&#125;</span>-linux-musl-gcc\&quot;</span></span><br></pre></td></tr></table></figure><p>如果没问题，就会在 <code>myapp/target/x86_64-unknown-linux-musl/release</code> 文件夹下生成一个 myapp 的文件就成功了，然后 scp 到目标 Linux 设备上进行调试。</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">RUST_LOG=info ./myapp --iface lo</span></span><br></pre></td></tr></table></figure><p>这里一定注意需要先指定 LOG 等级，以便能看到日志输出，然后 <code>--iface lo</code> 是指监听本地的回环网卡默认不写是 eth0，这里我们测试所以写 lo 其实也就是我常说的 127.0.0.1 啦，新开一个窗口 ping 测试</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta prompt_">$ </span><span class="language-bash">ping -c 1 127.0.0.1</span></span><br></pre></td></tr></table></figure><p>然后你就能看到输出的日志 <code>received a packet</code> 了</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">RUST_LOG=info ./myapp --iface lo</span><br><span class="line">Waiting for Ctrl-C...</span><br><span class="line">[INFO  myapp] received a packet</span><br><span class="line">[INFO  myapp] received a packet</span><br></pre></td></tr></table></figure><p>输出这个代表了什么意思呢？其实就相当于已经拿到了我们网络包了。</p><h2 id="尝试修改"><a href="#尝试修改" class="headerlink" title="尝试修改"></a>尝试修改</h2><p>大致浏览代码结构，囫囵吞枣就可以，反正第一次看也看不明白(手动狗头)。首先我们直接搜代码 <code>received a packet</code> 。</p><p>在 <code>myapp/myapp-ebpf/src/main.rs</code> 文件中你可以看到如下代码：</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#[xdp]</span></span><br><span class="line"><span class="keyword">pub</span> <span class="keyword">fn</span> <span class="title function_">myapp</span>(ctx: XdpContext) <span class="punctuation">-&gt;</span> <span class="type">u32</span> &#123;</span><br><span class="line">    <span class="keyword">match</span> <span class="title function_ invoke__">try_myapp</span>(ctx) &#123;</span><br><span class="line">        <span class="title function_ invoke__">Ok</span>(ret) =&gt; ret,</span><br><span class="line">        <span class="title function_ invoke__">Err</span>(_) =&gt; xdp_action::XDP_ABORTED,</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">fn</span> <span class="title function_">try_myapp</span>(ctx: XdpContext) <span class="punctuation">-&gt;</span> <span class="type">Result</span>&lt;<span class="type">u32</span>, <span class="type">u32</span>&gt; &#123;</span><br><span class="line">    info!(&amp;ctx, <span class="string">&quot;received a packet&quot;</span>);</span><br><span class="line">    <span class="title function_ invoke__">Ok</span>(xdp_action::XDP_PASS)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>很显然，<code>myapp</code> 直接去调用了 <code>try_myapp</code> 而 <code>try_myapp</code> 输出了日志，也就是拿到了网络包，可想而知在这个方法里面就能拿到网络包的相关数据了。也就是入参 <code>XdpContext</code>。那我们稍作修改，简单解析一下试试看。</p><p>首先解释一下 <code>myapp</code> 其实就是看 <code>try_myapp</code> 返回的是什么，如果是返回了错误，那么直接就 ABORTED 了这个包，也就是直接将这个包给扔掉了。其他情况就是返回什么就是什么。</p><h3 id="解析-XdpContext"><a href="#解析-XdpContext" class="headerlink" title="解析 XdpContext"></a>解析 XdpContext</h3><blockquote><p>考验你计算机网络基础知识是否扎实的挑战来了，还好我之前写过 <a href="https://www.linkinstars.com/post/5c140909.html">回首网络知识之 TCP 协议</a>，里面有报文格式的图片可以供你回忆</p></blockquote><p>其实 ip 报文最关键的信息就前面，所以我们就尝试修改拿到一下源 ip + 端口这个信息</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">use</span> network_types::&#123;</span><br><span class="line">    eth::&#123;EthHdr, EtherType&#125;,</span><br><span class="line">    ip::&#123;IpProto, Ipv4Hdr&#125;,</span><br><span class="line">    tcp::TcpHdr,</span><br><span class="line">    udp::UdpHdr,</span><br><span class="line">&#125;;</span><br><span class="line"></span><br><span class="line"><span class="meta">#[xdp]</span></span><br><span class="line"><span class="keyword">pub</span> <span class="keyword">fn</span> <span class="title function_">myapp</span>(ctx: XdpContext) <span class="punctuation">-&gt;</span> <span class="type">u32</span> &#123;</span><br><span class="line">    <span class="keyword">match</span> <span class="title function_ invoke__">try_myapp</span>(ctx) &#123;</span><br><span class="line">        <span class="title function_ invoke__">Ok</span>(ret) =&gt; ret,</span><br><span class="line">        <span class="title function_ invoke__">Err</span>(_) =&gt; xdp_action::XDP_ABORTED,</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">fn</span> <span class="title function_">try_myapp</span>(ctx: XdpContext) <span class="punctuation">-&gt;</span> <span class="type">Result</span>&lt;<span class="type">u32</span>, ()&gt; &#123;</span><br><span class="line">    info!(&amp;ctx, <span class="string">&quot;received a packet&quot;</span>);</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 获取以太网报文头部</span></span><br><span class="line">    <span class="keyword">let</span> <span class="variable">ethhdr</span>: *<span class="keyword">const</span> EthHdr = <span class="title function_ invoke__">ptr_at</span>(&amp;ctx, <span class="number">0</span>)?;</span><br><span class="line">    <span class="comment">// 如果不是IPv4报文，直接放行</span></span><br><span class="line">    <span class="keyword">match</span> <span class="keyword">unsafe</span> &#123; (*ethhdr).ether_type &#125; &#123;</span><br><span class="line">        EtherType::Ipv4 =&gt; &#123;&#125;</span><br><span class="line">        _ =&gt; <span class="keyword">return</span> <span class="title function_ invoke__">Ok</span>(xdp_action::XDP_PASS),</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 获取IPv4报文头部</span></span><br><span class="line">    <span class="keyword">let</span> <span class="variable">ipv4hdr</span>: *<span class="keyword">const</span> Ipv4Hdr = <span class="title function_ invoke__">ptr_at</span>(&amp;ctx, EthHdr::LEN)?;</span><br><span class="line"></span><br><span class="line">    <span class="comment">// 获取源IP地址</span></span><br><span class="line">    <span class="keyword">let</span> <span class="variable">source_addr</span> = <span class="type">u32</span>::<span class="title function_ invoke__">from_be</span>(<span class="keyword">unsafe</span> &#123; (*ipv4hdr).src_addr &#125;);</span><br><span class="line">    <span class="comment">// 获取源端口</span></span><br><span class="line">    <span class="keyword">let</span> <span class="variable">source_port</span> = <span class="keyword">match</span> <span class="keyword">unsafe</span> &#123; (*ipv4hdr).proto &#125; &#123;</span><br><span class="line">        IpProto::Tcp =&gt; &#123;</span><br><span class="line">            <span class="keyword">let</span> <span class="variable">tcphdr</span>: *<span class="keyword">const</span> TcpHdr = <span class="title function_ invoke__">ptr_at</span>(&amp;ctx, EthHdr::LEN + Ipv4Hdr::LEN)?;</span><br><span class="line">            <span class="type">u16</span>::<span class="title function_ invoke__">from_be</span>(<span class="keyword">unsafe</span> &#123; (*tcphdr).source &#125;)</span><br><span class="line">        &#125;</span><br><span class="line">        IpProto::Udp =&gt; &#123;</span><br><span class="line">            <span class="keyword">let</span> <span class="variable">udphdr</span>: *<span class="keyword">const</span> UdpHdr = <span class="title function_ invoke__">ptr_at</span>(&amp;ctx, EthHdr::LEN + Ipv4Hdr::LEN)?;</span><br><span class="line">            <span class="type">u16</span>::<span class="title function_ invoke__">from_be</span>(<span class="keyword">unsafe</span> &#123; (*udphdr).source &#125;)</span><br><span class="line">        &#125;</span><br><span class="line">        <span class="comment">// ICMP报文打印日志并直接丢弃</span></span><br><span class="line">        IpProto::Icmp =&gt; &#123;</span><br><span class="line">            info!(&amp;ctx, <span class="string">&quot;ICMP packet&quot;</span>);</span><br><span class="line">            <span class="keyword">return</span> <span class="title function_ invoke__">Err</span>(());</span><br><span class="line">        &#125;</span><br><span class="line">        _ =&gt; <span class="keyword">return</span> <span class="title function_ invoke__">Err</span>(()),</span><br><span class="line">    &#125;;</span><br><span class="line"></span><br><span class="line">    info!(&amp;ctx, <span class="string">&quot;SRC IP: &#123;:i&#125;, SRC PORT: &#123;&#125;&quot;</span>, source_addr, source_port);</span><br><span class="line">    <span class="title function_ invoke__">Ok</span>(xdp_action::XDP_PASS)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="meta">#[inline(always)]</span></span><br><span class="line"><span class="keyword">fn</span> <span class="title function_">ptr_at</span>&lt;T&gt;(ctx: &amp;XdpContext, offset: <span class="type">usize</span>) <span class="punctuation">-&gt;</span> <span class="type">Result</span>&lt;*<span class="keyword">const</span> T, ()&gt; &#123;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">start</span> = ctx.<span class="title function_ invoke__">data</span>();</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">end</span> = ctx.<span class="title function_ invoke__">data_end</span>();</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">len</span> = mem::size_of::&lt;T&gt;();</span><br><span class="line"></span><br><span class="line">    <span class="keyword">if</span> start + offset + len &gt; end &#123;</span><br><span class="line">        <span class="keyword">return</span> <span class="title function_ invoke__">Err</span>(());</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    <span class="title function_ invoke__">Ok</span>((start + offset) <span class="keyword">as</span> *<span class="keyword">const</span> T)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>这里我们做了两件重要的事情：</p><ol><li>解析报文并拿到源 IP 和端口</li><li>拒绝掉 ICMP 报文也就 ping 请求</li></ol><p>这其实是我们最常用的两个基本场景，一个就是判断来源，一个就是拦截报文。你可以想到，如果你想要封禁一些 ip 的访问，你就可以直接在这里做判断，并拦截直接丢弃，太强了。</p><p>注意：其中由于用到了 <code>network_types</code> 需要 add 一下，在 <code>myapp-ebpf</code> ，目录下执行下 <code>cargo add network-types</code>，然后重新编译并运行试试看。</p><h3 id="调试看看"><a href="#调试看看" class="headerlink" title="调试看看"></a>调试看看</h3><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line">$ RUST_LOG=info ./myapp --iface lo</span><br><span class="line">Waiting <span class="keyword">for</span> Ctrl-C...</span><br><span class="line">[INFO  myapp] received a packet</span><br><span class="line">[INFO  myapp] ICMP packet</span><br><span class="line">[INFO  myapp] received a packet</span><br><span class="line">[INFO  myapp] SRC IP: 127.0.0.1, SRC PORT: 41102</span><br><span class="line">[INFO  myapp] received a packet</span><br><span class="line">[INFO  myapp] SRC IP: 127.0.0.1, SRC PORT: 9090</span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">$ ping -c 1 127.0.0.1</span><br><span class="line">PING 127.0.0.1 (127.0.0.1) 56(84) bytes of data.</span><br><span class="line"></span><br><span class="line">--- 127.0.0.1 ping statistics ---</span><br><span class="line">1 packets transmitted, 0 received, 100% packet loss, <span class="keyword">time</span> 0m</span><br><span class="line"></span><br><span class="line">$ curl http://127.0.0.1:9090</span><br></pre></td></tr></table></figure><p>可以看到使用 ping 的时候直接就丢包了，而使用 http 情况的时候拿到了 源 ip+端口，非常棒棒。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>整体体验下来，你一定会认为我说的没错，除了一开始安装环境很麻烦，本身其实非常容易上手，由于整体框架已经很容易做开发了，简单的改改就能实现你的需求。通过这次你一定能快速体验 eBPF 究竟是什么样一个东西 XDP 到底厉害在哪里。有了这样的体验之后，我相信你再去深入学习会事半功倍。</p><p>有需要其他的 aya 使用案例可以直接参考官方的其他例子，也非常容易上手。<a href="https://github.com/aya-rs/book/tree/main/examples">https://github.com/aya-rs/book/tree/main/examples</a></p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;可能你听过 eBPF 这个听上去非常高大上的技术，但实际并没有使用过。今天让我们来用 rust 快速体验一下 eBPF</summary>
        
      
    
    
    
    <category term="rust" scheme="https://www.linkinstars.com/categories/rust/"/>
    
    
    <category term="eBPF" scheme="https://www.linkinstars.com/tags/eBPF/"/>
    
    <category term="aya" scheme="https://www.linkinstars.com/tags/aya/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》Service 创建之后发生了什么？</title>
    <link href="https://www.linkinstars.com/post/da328e81.html"/>
    <id>https://www.linkinstars.com/post/da328e81.html</id>
    <published>2024-10-14T16:00:00.000Z</published>
    <updated>2024-11-01T09:05:35.112Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/da328e81.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>在了解了基础的 pod 再到常用的 deployment，我们对于应用常用的 k8s 中对象应该已经了一个比较清晰的认识。对内没有问题之后，让我们来看看对外。要想让你部署的服务能被外部访问到，那么离不开的就是 service，也是我们最常见到的第一个有关与外部访问的对象了。所以在这一章的第一节我们先来看看 Service 是如何实现的。</p><p>在看源码之前问自己一个问题，为什么需要 service 呢？其实能想到的重要原因有两个：一个是默认情况下，容器没办法直接被外部访问到，就像我们使用 docker 一样，如果不绑定宿主机的端口是没办法为外部服务的；另一个就是负载均衡，因为我们的 pod 常常是有多个的，并且关键的是 pod 还会可能分在不同的机器上，如果没有一个合理的策略去将外部的流量转到对应的服务上就如同没有了导航方向标。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ol><li>Service 的基本使用</li><li>Service 的类型</li></ol><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>这里看源码很容易被陷入进去，由于 Service 功能并没有那么的直接涉及的细节和点(技术方案)很多，如果直接搜索，然后看名字然后一个个看下来容易迷茫并且很难串起来。让我们回到第一章第一节看源码的时候，重新出发。抛开所有我们暂时不关心的地方，仅仅看主线，所以我们应该关注什么呢？从类型来看 ClusterIP、NodePort、LoadBalancer 这几种，ClusterIP 是用来内部访问的，NodePort 将绑定每个节点的一个端口来访问。我觉得这两个应该是我们最先接触 k8s 用到的，一个用于服务与服务之间的访问，一个用于外部访问做测试，内网或者本地测试经常用到。</p><p>而我们先不聚焦使用，或者说原理，第一节我们就最基本的看下 Service 这个对象是如何被创建的，创建了之后做了什么。有经验的同学可能了解并知道 kube-proxy 以及 iptables、ipvs 等等，那么 service 的创建之后会对他们有什么影响吗？</p><p>这下其实就变得非常简单的，只要找到 Service 对象和创建、控制的过程就可以了。没错，这一节我们就仅仅看这么多，所以特别简单不用担心。</p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><ol><li>Service 创建之后做了些什么？</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><h3 id="数据结构"><a href="#数据结构" class="headerlink" title="数据结构"></a>数据结构</h3><p>看源码的方式还是一样的，首先我们最容易的也是最熟悉的是 Service 的数据结构，也就是我们常常看到的 yaml 文件</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Service</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">type:</span> <span class="string">ClusterIP</span></span><br><span class="line">  <span class="attr">ports:</span></span><br><span class="line">    <span class="bullet">-</span> <span class="attr">protocol:</span> <span class="string">TCP</span></span><br><span class="line">      <span class="attr">port:</span> <span class="number">80</span></span><br><span class="line">      <span class="attr">targetPort:</span> <span class="number">80</span></span><br><span class="line">  <span class="attr">clusterIP:</span> <span class="string">x.x.x.x</span></span><br></pre></td></tr></table></figure><p>只要他是一个对象，那么在源码里面一定是一个 struct 的。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/api/core/v1/types.go:5049</span></span><br><span class="line"><span class="comment">// vendor/k8s.io/api/core/v1/types.go:5049 （也是一样的）</span></span><br><span class="line"><span class="keyword">type</span> Service <span class="keyword">struct</span> &#123;</span><br><span class="line">metav1.TypeMeta <span class="string">`json:&quot;,inline&quot;`</span></span><br><span class="line">metav1.ObjectMeta <span class="string">`json:&quot;metadata,omitempty&quot; protobuf:&quot;bytes,1,opt,name=metadata&quot;`</span></span><br><span class="line">Spec ServiceSpec <span class="string">`json:&quot;spec,omitempty&quot; protobuf:&quot;bytes,2,opt,name=spec&quot;`</span></span><br><span class="line">Status ServiceStatus <span class="string">`json:&quot;status,omitempty&quot; protobuf:&quot;bytes,3,opt,name=status&quot;`</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// staging/src/k8s.io/api/core/v1/types.go:4743</span></span><br><span class="line"><span class="keyword">type</span> ServiceSpec <span class="keyword">struct</span> &#123;</span><br><span class="line">Ports []ServicePort <span class="string">`json:&quot;ports,omitempty&quot; patchStrategy:&quot;merge&quot; patchMergeKey:&quot;port&quot; protobuf:&quot;bytes,1,rep,name=ports&quot;`</span></span><br><span class="line">Selector <span class="keyword">map</span>[<span class="type">string</span>]<span class="type">string</span> <span class="string">`json:&quot;selector,omitempty&quot; protobuf:&quot;bytes,2,rep,name=selector&quot;`</span></span><br><span class="line">ClusterIP <span class="type">string</span> <span class="string">`json:&quot;clusterIP,omitempty&quot; protobuf:&quot;bytes,3,opt,name=clusterIP&quot;`</span></span><br><span class="line">ClusterIPs []<span class="type">string</span> <span class="string">`json:&quot;clusterIPs,omitempty&quot; protobuf:&quot;bytes,18,opt,name=clusterIPs&quot;`</span></span><br><span class="line">Type ServiceType <span class="string">`json:&quot;type,omitempty&quot; protobuf:&quot;bytes,4,opt,name=type,casttype=ServiceType&quot;`</span></span><br><span class="line"><span class="comment">//...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>显然这个结构就是我们常常见到 yaml 的对应了</p><h3 id="控制器"><a href="#控制器" class="headerlink" title="控制器"></a>控制器</h3><p>有了之前第二章的经验，我们知道，在 k8s 中往往对象的控制都是通过一个控制器(controller)去控制的，并且控制的方式也都是通过 <code>Informer</code> 的方式，Service 也不例外，不过它不太好找。之前我们看 <code>Deployment</code> 那我们去找 <code>DeploymentController</code> 对吧，但我们这次可没找到一个叫做 <code>ServiceController</code> 的东西。并且你搜 <code>ServiceController</code> 容易误区找到其他一个对象。而它其实称为 <code>EndpointController</code> ，我们可以通过 <code>ServiceInformer</code> 找到它。先来看看 <code>NewEndpointController</code> 也就是创建的过程。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/endpoint/endpoints_controller.go:73</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewEndpointController</span><span class="params">(podInformer coreinformers.PodInformer, serviceInformer coreinformers.ServiceInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">endpointsInformer coreinformers.EndpointsInformer, client clientset.Interface, endpointUpdatesBatchPeriod time.Duration)</span></span> *Controller &#123;</span><br><span class="line">broadcaster := record.NewBroadcaster()</span><br><span class="line">recorder := broadcaster.NewRecorder(scheme.Scheme, v1.EventSource&#123;Component: <span class="string">&quot;endpoint-controller&quot;</span>&#125;)</span><br><span class="line"></span><br><span class="line">e := &amp;Controller&#123;</span><br><span class="line">client:           client,</span><br><span class="line">queue:            workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), <span class="string">&quot;endpoint&quot;</span>),</span><br><span class="line">workerLoopPeriod: time.Second,</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">serviceInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">AddFunc: e.onServiceUpdate,</span><br><span class="line">UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(old, cur <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">e.onServiceUpdate(cur)</span><br><span class="line">&#125;,</span><br><span class="line">DeleteFunc: e.onServiceDelete,</span><br><span class="line">&#125;)</span><br><span class="line">e.serviceLister = serviceInformer.Lister()</span><br><span class="line">e.servicesSynced = serviceInformer.Informer().HasSynced</span><br><span class="line"></span><br><span class="line">podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">AddFunc:    e.addPod,</span><br><span class="line">UpdateFunc: e.updatePod,</span><br><span class="line">DeleteFunc: e.deletePod,</span><br><span class="line">&#125;)</span><br><span class="line">e.podLister = podInformer.Lister()</span><br><span class="line">e.podsSynced = podInformer.Informer().HasSynced</span><br><span class="line"></span><br><span class="line">endpointsInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">DeleteFunc: e.onEndpointsDelete,</span><br><span class="line">&#125;)</span><br><span class="line">e.endpointsLister = endpointsInformer.Lister()</span><br><span class="line">e.endpointsSynced = endpointsInformer.Informer().HasSynced</span><br><span class="line"></span><br><span class="line">e.staleEndpointsTracker = newStaleEndpointsTracker()</span><br><span class="line">e.triggerTimeTracker = endpointsliceutil.NewTriggerTimeTracker()</span><br><span class="line">e.eventBroadcaster = broadcaster</span><br><span class="line">e.eventRecorder = recorder</span><br><span class="line"></span><br><span class="line">e.endpointUpdatesBatchPeriod = endpointUpdatesBatchPeriod</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> e</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>无须细看，这套路我们再熟悉不过了，和之前一样的模式，Informer、Event 我们关注的 Service 就在里面。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">serviceInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">    AddFunc: e.onServiceUpdate,</span><br><span class="line">    UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(old, cur <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">        e.onServiceUpdate(cur)</span><br><span class="line">    &#125;,</span><br><span class="line">    DeleteFunc: e.onServiceDelete,</span><br><span class="line">&#125;)</span><br><span class="line">e.serviceLister = serviceInformer.Lister()</span><br><span class="line">e.servicesSynced = serviceInformer.Informer().HasSynced</span><br></pre></td></tr></table></figure><p>在 <code>onServiceUpdate</code> 里面就只是简单入队而已，然后依旧和之前一样，还是看它的 <code>Run</code> 方法，然后通过启动多个 worker 执行 <code>worker</code> 方法就</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/endpoint/endpoints_controller.go:166</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(e *Controller)</span></span> Run(ctx context.Context, workers <span class="type">int</span>) &#123;</span><br><span class="line"><span class="keyword">defer</span> utilruntime.HandleCrash()</span><br><span class="line"></span><br><span class="line"><span class="comment">// Start events processing pipeline.</span></span><br><span class="line">e.eventBroadcaster.StartStructuredLogging(<span class="number">0</span>)</span><br><span class="line">e.eventBroadcaster.StartRecordingToSink(&amp;v1core.EventSinkImpl&#123;Interface: e.client.CoreV1().Events(<span class="string">&quot;&quot;</span>)&#125;)</span><br><span class="line"><span class="keyword">defer</span> e.eventBroadcaster.Shutdown()</span><br><span class="line"></span><br><span class="line"><span class="keyword">defer</span> e.queue.ShutDown()</span><br><span class="line"></span><br><span class="line">logger := klog.FromContext(ctx)</span><br><span class="line">logger.Info(<span class="string">&quot;Starting endpoint controller&quot;</span>)</span><br><span class="line"><span class="keyword">defer</span> logger.Info(<span class="string">&quot;Shutting down endpoint controller&quot;</span>)</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> !cache.WaitForNamedCacheSync(<span class="string">&quot;endpoint&quot;</span>, ctx.Done(), e.podsSynced, e.servicesSynced, e.endpointsSynced) &#123;</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i := <span class="number">0</span>; i &lt; workers; i++ &#123;</span><br><span class="line"><span class="keyword">go</span> wait.UntilWithContext(ctx, e.worker, e.workerLoopPeriod)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line"><span class="keyword">defer</span> utilruntime.HandleCrash()</span><br><span class="line">e.checkLeftoverEndpoints()</span><br><span class="line">&#125;()</span><br><span class="line"></span><br><span class="line">&lt;-ctx.Done()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>然后 <code>worker</code> -&gt; <code>processNextWorkItem</code> -&gt; <code>syncService</code> 最终在里面处理。</p><h3 id="syncService"><a href="#syncService" class="headerlink" title="syncService"></a>syncService</h3><p>这个方法很长对吧？200+ 行，但其实本身并不复杂。本质就做了两件大事。我精简了部分之后，大体看上去是这样的。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br><span class="line">65</span><br><span class="line">66</span><br><span class="line">67</span><br><span class="line">68</span><br><span class="line">69</span><br><span class="line">70</span><br><span class="line">71</span><br><span class="line">72</span><br><span class="line">73</span><br><span class="line">74</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/endpoint/endpoints_controller.go:358</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(e *Controller)</span></span> syncService(ctx context.Context, key <span class="type">string</span>) <span class="type">error</span> &#123;</span><br><span class="line">namespace, name, err := cache.SplitMetaNamespaceKey(key)</span><br><span class="line"></span><br><span class="line">service, err := e.serviceLister.Services(namespace).Get(name)</span><br><span class="line"></span><br><span class="line">pods, err := e.podLister.Pods(service.Namespace).List(labels.Set(service.Spec.Selector).AsSelectorPreValidated())</span><br><span class="line"></span><br><span class="line">subsets := []v1.EndpointSubset&#123;&#125;</span><br><span class="line"><span class="keyword">var</span> totalReadyEps <span class="type">int</span></span><br><span class="line"><span class="keyword">var</span> totalNotReadyEps <span class="type">int</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> _, pod := <span class="keyword">range</span> pods &#123;</span><br><span class="line">ep, err := podToEndpointAddressForService(service, pod)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Allow headless service not to have ports.</span></span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(service.Spec.Ports) == <span class="number">0</span> &#123;</span><br><span class="line"><span class="keyword">if</span> service.Spec.ClusterIP == api.ClusterIPNone &#123;</span><br><span class="line">subsets, totalReadyEps, totalNotReadyEps = addEndpointSubset(logger, subsets, pod, epa, <span class="literal">nil</span>, service.Spec.PublishNotReadyAddresses)</span><br><span class="line"><span class="comment">// No need to repack subsets for headless service without ports.</span></span><br><span class="line">&#125;</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line"><span class="keyword">for</span> i := <span class="keyword">range</span> service.Spec.Ports &#123;</span><br><span class="line">servicePort := &amp;service.Spec.Ports[i]</span><br><span class="line">portNum, err := podutil.FindPort(pod, servicePort)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">logger.V(<span class="number">4</span>).Info(<span class="string">&quot;Failed to find port for service&quot;</span>, <span class="string">&quot;service&quot;</span>, klog.KObj(service), <span class="string">&quot;error&quot;</span>, err)</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line">epp := endpointPortFromServicePort(servicePort, portNum)</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> readyEps, notReadyEps <span class="type">int</span></span><br><span class="line">subsets, readyEps, notReadyEps = addEndpointSubset(logger, subsets, pod, epa, epp, service.Spec.PublishNotReadyAddresses)</span><br><span class="line">totalReadyEps = totalReadyEps + readyEps</span><br><span class="line">totalNotReadyEps = totalNotReadyEps + notReadyEps</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">subsets = endpoints.RepackSubsets(subsets)</span><br><span class="line"></span><br><span class="line"><span class="comment">// See if there&#x27;s actually an update here.</span></span><br><span class="line">currentEndpoints, err := e.endpointsLister.Endpoints(service.Namespace).Get(service.Name)</span><br><span class="line"></span><br><span class="line">createEndpoints := <span class="built_in">len</span>(currentEndpoints.ResourceVersion) == <span class="number">0</span></span><br><span class="line">compareLabels := currentEndpoints.Labels</span><br><span class="line"><span class="keyword">if</span> _, ok := currentEndpoints.Labels[v1.IsHeadlessService]; ok &#123;</span><br><span class="line">compareLabels = utillabels.CloneAndRemoveLabel(currentEndpoints.Labels, v1.IsHeadlessService)</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// When comparing the subsets, we ignore the difference in ResourceVersion of Pod to avoid unnecessary Endpoints</span></span><br><span class="line"><span class="comment">// updates caused by Pod updates that we don&#x27;t care, e.g. annotation update.</span></span><br><span class="line"><span class="keyword">if</span> !createEndpoints &amp;&amp;</span><br><span class="line">endpointSubsetsEqualIgnoreResourceVersion(currentEndpoints.Subsets, subsets) &amp;&amp;</span><br><span class="line">apiequality.Semantic.DeepEqual(compareLabels, service.Labels) &amp;&amp;</span><br><span class="line">capacityAnnotationSetCorrectly(currentEndpoints.Annotations, currentEndpoints.Subsets) &#123;</span><br><span class="line">logger.V(<span class="number">5</span>).Info(<span class="string">&quot;endpoints are equal, skipping update&quot;</span>, <span class="string">&quot;service&quot;</span>, klog.KObj(service))</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line">newEndpoints := currentEndpoints.DeepCopy()</span><br><span class="line">newEndpoints.Subsets = subsets</span><br><span class="line">newEndpoints.Labels = service.Labels</span><br><span class="line"></span><br><span class="line"></span><br><span class="line">logger.V(<span class="number">4</span>).Info(<span class="string">&quot;Update endpoints&quot;</span>, <span class="string">&quot;service&quot;</span>, klog.KObj(service), <span class="string">&quot;readyEndpoints&quot;</span>, totalReadyEps, <span class="string">&quot;notreadyEndpoints&quot;</span>, totalNotReadyEps)</span><br><span class="line"><span class="keyword">var</span> updatedEndpoints *v1.Endpoints</span><br><span class="line"><span class="keyword">if</span> createEndpoints &#123;</span><br><span class="line"><span class="comment">// No previous endpoints, create them</span></span><br><span class="line">_, err = e.client.CoreV1().Endpoints(service.Namespace).Create(ctx, newEndpoints, metav1.CreateOptions&#123;&#125;)</span><br><span class="line">&#125; <span class="keyword">else</span> &#123;</span><br><span class="line"><span class="comment">// Pre-existing</span></span><br><span class="line">updatedEndpoints, err = e.client.CoreV1().Endpoints(service.Namespace).Update(ctx, newEndpoints, metav1.UpdateOptions&#123;&#125;)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>第一件事情很简单，Service 最关键的东西是什么？port 其实就是端口，因为除了 ip 和端口，你在 Service 里面也很少配置其他东西了不是吗… 所以其实看到的第一件事情也很简单，就是遍历所有的 service 将其中有 Ports 的情况搬出来为后面准备。因为对于 ClusterIP 来说，内部访问，那么知道 ip + 端口才能行。也就是前半部 <code>for _, pod := range pods &#123;</code> 里面的事情。</p><p>然后第二件事情是在 <code>// See if there&#39;s actually an update here.</code> 这句话开始的时候（注释已经说的很明显了）去根据上面查询出来的情况对比新旧情况，去配置对应的 <code>Endpoints</code> ，当然这也是为什么它称为 <code>endpoints_controller</code> 的原因。最终如果需要创建 <code>createEndpoints</code> 则创建，如果需要更新 <code>updatedEndpoints</code> 更新。</p><p>好了，这个时候有同学就要问题 Endpoint 是啥呀，我怎么从来没听过呢？其实它也是 k8s 里面一个对象，只是由于我们不需要直接接触和配置它，通常使用的时候接触不到。你可以通过</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubectl get endpoints --all-namespaces</span><br></pre></td></tr></table></figure><p>这样的命令看到，这不就是我们内部访问的 ip 和 端口吗？</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br></pre></td><td class="code"><pre><span class="line">kube-system                    kube-controller-manager-svc               10.0.10.205:10257                                                   2y111d</span><br><span class="line">kube-system                    kube-dns                                  172.16.0.183:53,172.16.0.191:53,172.16.0.183:53 + 3 more...         2y118d</span><br><span class="line">kube-system                    kube-scheduler-svc                        10.0.10.205:10259                                                   2y111d</span><br><span class="line">kube-system                    kubelet                                   10.0.10.207:10250,10.0.10.209:10250,10.0.10.205:10250 + 6 more...   2y111d</span><br><span class="line">kube-system                    metrics-server                            10.0.10.207:4443                                                    2y111d</span><br></pre></td></tr></table></figure><p>也就是说，其实 service 只要正常的配置，正常的更新就可以了，而在 <code>EndpointController</code> 里面会根据这部分更新来更新 Endpoints ，从而配置好内部的访问情况。这也为什么我让你看了 EndpointController 的原因。这也是为什么我说整体比较简单的原因，因为其实本质上没有复杂的业务逻辑和调度切换关系。只有配置的更新。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li>Service 创建之后做了些什么？<ol><li>从 Endpoint 角度来说，Service 的更新就直接影响了 Endpoint 的配置，根据不同的配置端口信息有了不同的配置</li></ol></li></ol><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><h3 id="编码上"><a href="#编码上" class="headerlink" title="编码上"></a>编码上</h3><p>这次在编码上有看到一个小细节，在 <code>Run</code> 的时候，使用了一个 <code>defer utilruntime.HandleCrash()</code>，别小看它，虽然只是对于一个 panic 的 recover 封装处理，但要记住，很多写法都会导致 panic 无法正确被捕获到哦。然后，内部默认有一个 PanicHandler，也可以自己传入 handler 去处理。最后判断 <code>ReallyCrash</code> 确认是否真的要崩溃。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">HandleCrash</span><span class="params">(additionalHandlers ...<span class="keyword">func</span>(<span class="keyword">interface</span>&#123;&#125;)</span></span>) &#123;</span><br><span class="line"><span class="keyword">if</span> r := <span class="built_in">recover</span>(); r != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">for</span> _, fn := <span class="keyword">range</span> PanicHandlers &#123;</span><br><span class="line">fn(r)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">for</span> _, fn := <span class="keyword">range</span> additionalHandlers &#123;</span><br><span class="line">fn(r)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> ReallyCrash &#123;</span><br><span class="line"><span class="comment">// Actually proceed to panic.</span></span><br><span class="line"><span class="built_in">panic</span>(r)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>总之是一个很不错的 panic 的 recover 封装，可以直接拿来使用。</p><h3 id="设计上"><a href="#设计上" class="headerlink" title="设计上"></a>设计上</h3><p>其实和其他设计一样，都是对象的改变引起了配置的改变。而每一个改变也都是通过 Informer 机制联系在一起的。其实这一节的关键是要告诉你，有 Endpoint 这个对象的存在，有了它，映射关系就有了，就能定位到一个对象应该如何被正确访问了。当然这一节我们只是定位到了配置上的变动，下面我们将从原理上看看流量到底是如何被转发的。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>The little dict 一个非常好用的欧陆词典扩展</title>
    <link href="https://www.linkinstars.com/post/952c166b.html"/>
    <id>https://www.linkinstars.com/post/952c166b.html</id>
    <published>2024-10-09T16:00:00.000Z</published>
    <updated>2024-11-08T09:42:39.773Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/952c166b.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>最近发现一个很不错的欧陆词典扩展，叫做 The little dict，给出的解释的同时还有一些使用频率以及各种解释，非常适合英语学习者使用。忍不住分享一下。</p><h2 id="下载地址"><a href="#下载地址" class="headerlink" title="下载地址"></a>下载地址</h2><blockquote><p>网上找的，你也可以自己找找看</p></blockquote><p><a href="https://pan.baidu.com/s/1OblabSK7M7jxGd_bSQaZzg#list/path=%2F">https://pan.baidu.com/s/1OblabSK7M7jxGd_bSQaZzg#list/path=%2F</a></p><p>提取码：s5z8</p><h2 id="安装"><a href="#安装" class="headerlink" title="安装"></a>安装</h2><p>其实安装非常简单，下载之后解压，然后在管理-已安装词库的界面，将文件夹拖到里面即可。注意原始文件夹不要删掉。</p><p><img src="https://blog.linkinstars.com/blog/eudic-the-little-dict-install.png" alt="install"></p><h2 id="预览"><a href="#预览" class="headerlink" title="预览"></a>预览</h2><p><img src="https://blog.linkinstars.com/blog/eudic-the-little-dict-0.gif" alt="install"></p><p><img src="https://blog.linkinstars.com/blog/eudic-the-little-dict-1.png" alt="install"></p><p><img src="https://blog.linkinstars.com/blog/eudic-the-little-dict-2.png" alt="install"></p><p><img src="https://blog.linkinstars.com/blog/eudic-the-little-dict-3.png" alt="install"></p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;最近发现一个很不错的欧陆词典扩展，叫做 The little</summary>
        
      
    
    
    
    <category term="macos-hint" scheme="https://www.linkinstars.com/categories/macos-hint/"/>
    
    
    <category term="eudic" scheme="https://www.linkinstars.com/tags/eudic/"/>
    
  </entry>
  
  <entry>
    <title>新手上路 Rust 的字符串</title>
    <link href="https://www.linkinstars.com/post/3010f6a0.html"/>
    <id>https://www.linkinstars.com/post/3010f6a0.html</id>
    <published>2024-09-24T16:00:00.000Z</published>
    <updated>2024-10-22T06:30:24.064Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/3010f6a0.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>相比与 Golang 或者是其他语言的字符串的设计 Rust 对于字符串的设计就要复杂不少，新手第一坑可能就容易掉进去，并且觉得很难用。不过其实掌握其中的几个基本概念，就会发现也就这么回事。今天我们来唠唠让人头疼的字符串。</p><h2 id="两个问题"><a href="#两个问题" class="headerlink" title="两个问题"></a>两个问题</h2><p>一般来说，很少有人第一个语言上手就是 rust ，当带着其他语言的影子来使用 rust 的时候势必会遇到下面两个问题。</p><h3 id="问题-1-直接写的不是字符串"><a href="#问题-1-直接写的不是字符串" class="headerlink" title="问题 1: 直接写的不是字符串"></a>问题 1: 直接写的不是字符串</h3><p>如下面的代码，就会报错</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">fn</span> <span class="title function_">main</span>() &#123;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">s</span> = <span class="string">&quot;我是一个字符串&quot;</span>;</span><br><span class="line">    <span class="title function_ invoke__">check</span>(s);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">fn</span> <span class="title function_">check</span>(s: <span class="type">String</span>) &#123;</span><br><span class="line">    <span class="built_in">println!</span>(<span class="string">&quot;不，&#123;&#125;，你不是个字符串!&quot;</span>, s);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>新手遇到的第一个问题就是直接写字符串 “xxx” ，发现不好用。</p><p>那么第一个误区就来了，”xxx” 这样的表示通常用字面量来称呼，而在 rust 里面，直接这样写，s 就是一个 <code>&amp;str</code> 类型，可以理解为</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">let</span> <span class="variable">s</span>: &amp;<span class="type">str</span> = <span class="string">&quot;xxx&quot;</span>;</span><br></pre></td></tr></table></figure><p>至于 &amp;str 是什么后面会说到。</p><h3 id="问题-2-不能索引"><a href="#问题-2-不能索引" class="headerlink" title="问题 2: 不能索引"></a>问题 2: 不能索引</h3><p>这个问题一开始我也痛苦了半天，不理解的时候特别难受。</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">fn</span> <span class="title function_">check</span>(s: <span class="type">String</span>) &#123;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">a</span> = s[<span class="number">1</span>];</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>像别的语言，都可以通过下标来索引到字符串具体的位置上的字符，而 rust 不行，如上的代码就会报错。</p><p>其实原因说起来也简单，在其他语言，比如 Go，在做字符串切分的时候，比如取字符串长度前面为 8 的字串，那么 <code>s[0:8]</code> 对吧。但是如果内容是中文或其他语言，截取的长度就是不对的，由于每个字符所占用的字节是不一样的，utf8 编码下。所以在做字符串截取的时候，要特别处理。而由于我们常常截取都是英文，所以没出现问题，当然截取的本质就是索引下标。</p><p>而 Rust 到好，直接一刀切，压根就不能索引也就不存在截取的问题了。<strong>直接从语言层面就让你明白这个问题其实一直都在</strong>。</p><h4 id="如何遍历"><a href="#如何遍历" class="headerlink" title="如何遍历"></a>如何遍历</h4><p>先解决一下你的疑虑，如果不能索引，那么字符串如何遍历呢？答案是 chars()</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">for</span> <span class="variable">c</span> <span class="keyword">in</span> <span class="string">&quot;我是一个字符串&quot;</span>.<span class="title function_ invoke__">chars</span>() &#123;</span><br><span class="line">    <span class="built_in">println!</span>(<span class="string">&quot;&#123;&#125;&quot;</span>, c);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>是的，你只能通过这样的方式去做了</p><h2 id="其实就两个东西"><a href="#其实就两个东西" class="headerlink" title="其实就两个东西"></a>其实就两个东西</h2><p>首先，rust 里面有很多其他类型的字符串，但最基础的情况下，你只需要认识两个东西就可以了，<code>&amp;str</code> 和 <code>String</code></p><ul><li><code>&amp;str</code> 其本质就是切片的引用，切片在很多语言下也有，你可以简单把它理解成为视图的概念</li><li><code>String</code> 而这才是真正我们常用的字符串，可以对它做删除、拼接等等操作</li></ul><p><code>&amp;str</code> 可以转 <code>String</code> 通过 <code>to_string()</code> 方法就可以，也就是一开始的代码需要这样写。</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">fn</span> <span class="title function_">main</span>() &#123;</span><br><span class="line">    <span class="keyword">let</span> <span class="variable">s</span> = <span class="string">&quot;我是一个字符串&quot;</span>;</span><br><span class="line">    <span class="title function_ invoke__">check</span>(s.<span class="title function_ invoke__">to_string</span>());</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">fn</span> <span class="title function_">check</span>(s: <span class="type">String</span>) &#123;</span><br><span class="line">    <span class="built_in">println!</span>(<span class="string">&quot;不，&#123;&#125;，你不是个字符串!&quot;</span>, s);</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>而 String 的使用更为常见，主要是声明：</p><figure class="highlight rust"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">let</span> <span class="variable">s</span> = <span class="type">String</span>::<span class="title function_ invoke__">from</span>(<span class="string">&quot;我是一个字符串&quot;</span>);</span><br></pre></td></tr></table></figure><p>这样才是一个 String 该有的样子，虽然我知道这样写很多人就会觉得，哇，好麻烦。</p><p>而字符串就很容易了，它就可以使用 <code>push、replace</code> 等等方法做操作了，指定注意的是，你需要关心它的方法是否操作了原来的字符串还是新返回了一个字符串。这个在 IDE 里面都是有提示的，问题不大。</p><h2 id="弄清楚可变不可变"><a href="#弄清楚可变不可变" class="headerlink" title="弄清楚可变不可变"></a>弄清楚可变不可变</h2><p>可，rust 为什么要这样设计呢？将一个原本简单的 “” 变得复杂？</p><p>答案其实在 <code>mut</code> ，当然不是这个关键字，而是可变不可变。<strong>当一个语言把 “让可以对象可变” 需要特别使用可以关键字 mut 的时候，你就知道它对可变不可变的关心有多么强烈</strong>。</p><p>字符串也是一样的。之所以 “xxx” 是 <code>&amp;str</code> 类型，本质来说就是想让大家知道它不可变，由于 “xxx” 这样的声明往往都是以静态方式放在最终应用里面的，也就是说硬编码到了可执行文件里面去。而对于动态可变的 String 来说，rust 需要时刻关注它使用完成之后需要将内存释放掉。也就是对于这样可变不可变的关注，从而让原本 “xxx” 变得不太一样。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>在你看来 rust 很多奇怪的设计，往往都和 “为了去掉 GC” 相挂钩。除了这点，从字符串不能索引也可以看出，在语言层面 rust 想要限制很多它认为不对的事情。最后，依然要夸一下 rust 编译器的强大，当你字符串使用的过程中出现意外的时候，编译器往往提供了大量的信息，并且明确指出了出现错误的原因，我给你的建议是，慢慢看，其实它不是在说废话。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;相比与 Golang 或者是其他语言的字符串的设计 Rust</summary>
        
      
    
    
    
    <category term="rust" scheme="https://www.linkinstars.com/categories/rust/"/>
    
    
    <category term="rust" scheme="https://www.linkinstars.com/tags/rust/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》简单的 DaemonSet</title>
    <link href="https://www.linkinstars.com/post/30ecfe33.html"/>
    <id>https://www.linkinstars.com/post/30ecfe33.html</id>
    <published>2024-09-14T16:00:00.000Z</published>
    <updated>2024-09-05T06:38:41.519Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/30ecfe33.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>相比较于 deployment 和 StatefulSet，DaemonSet 是更简单的一个，也是最不常用的一个对象了。对于应用开发的同学来说可能几乎见不到它，而对于运维或者 SRE 的同学可能会熟悉一些。DaemonSet 用于确保集群中的<strong>每个节点运行有且仅有一个 pod 实例的场景</strong>。两个最常见的场景是：日志收集和监控。日志收集是为了收集每个节点上的日志，而监控则是为了监控每个节点的一些数据指标。通常来说以全局平台或者节点为场景的情况下才会想到它。那么 DaemonSet 的如何保证每个节点 pod 的数量呢？这一节让我从源码的角度看看它是如何实现的。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>DaemonSet 的基本使用</li></ul><h2 id="码前讨论"><a href="#码前讨论" class="headerlink" title="码前讨论"></a>码前讨论</h2><p>首先代码位置就不多说了，有前面的经验。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">kubernetes/pkg/controller/daemon</span><br></pre></td></tr></table></figure><p>由于前面我们已经看过了 deployment、rs、StatefulSet，那么其实对于 DaemonSet，我们也是一样几乎大致的形态结构都已经可以八九不离十了，而且它只有 <code>daemon_controller.go</code> 和 <code>update.go</code> 两个文件，就像我前面说的也它其实很简单，并且功能也不复杂。所以这次我们换一种方式来认识源码，放大之前提问的部分。我们在最开始第一节的时候就提到过，看源码之前提几个问题能帮助我们快速进入状态和定位关键。而对于熟悉的结构，我们更可以通过这样的方式来快速阅读源码，而非逐字逐句去做翻译。</p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><h3 id="问题-1"><a href="#问题-1" class="headerlink" title="问题 1"></a>问题 1</h3><p>我们知道 DaemonSet 确保集群中每个节点有且仅有一个 pod ，那么当节点数量变化的时候，它一定会随之改变，那么 DaemonSet 的 controller 是如何感知这个变化的呢？如果是你去编写，你会从何处入手？在看源码之前你可以先大胆假设一下。</p><h3 id="问题-2"><a href="#问题-2" class="headerlink" title="问题 2"></a>问题 2</h3><p>关键的问题在于 DaemonSet 是如何保证集群中每个节点有且仅有一个 pod 的呢？需要做哪些设置呢。同样的，再看源码之前，你可以先问问自己，不是 DaemonSet 的情况，如果是一个普通的 deployment 你能否做的让 pod 调度到每个节点一个？如果可以，那么 DaemonSet 或许就是类似的思路。</p><h3 id="问题-3"><a href="#问题-3" class="headerlink" title="问题 3"></a>问题 3</h3><p>为了保证 pod 的关系和数量，我会猜测 DaemonSet 可能需要存 node 和 pod 的对应关系，如果有，是存在了哪里？</p><p>你可以先不看下面的分析，自己去寻找这三个问题的答案，找到之后再回来核对，看看是否与你的想法一致。</p><hr><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><h3 id="问题-1-1"><a href="#问题-1-1" class="headerlink" title="问题 1"></a>问题 1</h3><blockquote><p>DaemonSet 是如何感知节点的变化的？</p></blockquote><p>第一个问题相对来说比较简单。由于我们之前看过的所有对象来说，无论是对象本身的变化，还是 pod 的变化都是通过 informer 机制来告诉 controller 的。所以 node 的变化也无意外，也是通过这样的事件机制来做的。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/daemon/daemon_controller.go:134</span></span><br><span class="line"><span class="comment">// NewDaemonSetsController creates a new DaemonSetsController</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewDaemonSetsController</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">ctx context.Context,</span></span></span><br><span class="line"><span class="params"><span class="function">daemonSetInformer appsinformers.DaemonSetInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">historyInformer appsinformers.ControllerRevisionInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">podInformer coreinformers.PodInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">nodeInformer coreinformers.NodeInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">kubeClient clientset.Interface,</span></span></span><br><span class="line"><span class="params"><span class="function">failedPodsBackoff *flowcontrol.Backoff,</span></span></span><br><span class="line"><span class="params"><span class="function">)</span></span> (*DaemonSetsController, <span class="type">error</span>) &#123;</span><br><span class="line">eventBroadcaster := record.NewBroadcaster()</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">nodeInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">AddFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dsc.addNode(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(oldObj, newObj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dsc.updateNode(logger, oldObj, newObj)</span><br><span class="line">&#125;,</span><br><span class="line">&#125;,</span><br><span class="line">)</span><br><span class="line">dsc.nodeStoreSynced = nodeInformer.Informer().HasSynced</span><br><span class="line">dsc.nodeLister = nodeInformer.Lister()</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> dsc, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>在 <code>NewDaemonSetsController</code> 方法中可以明确看到，通过 nodeInformer 添加了有关节点变化的 event 处理方法，当有对应事件的时候，也就是 node 有变化的时候我们就能知道，并做出相应的调整。</p><p>如果这部分你能在看源码之前猜测到，那我觉得对于 informer 整个机制应该是真的掌握了。</p><h3 id="问题-2-1"><a href="#问题-2-1" class="headerlink" title="问题 2"></a>问题 2</h3><blockquote><p>DaemonSet 是如何保证集群中每个节点有且仅有一个 pod 的?</p></blockquote><p>这个问题稍微复杂一些，考查了你对于 k8s 一些基础概念的了解。我特别也没有在前置知识里面提及是怕过早公布答案。首先，让我们来想一下后面一个小问题，也就是如何让 deployment 能均匀分布到各个节点上去。</p><p>如果想把某个 pod 直接调度到特定的节点上，我们可以直接在 spec 下配置 nodeName 来解决。</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">apiVersion:</span> <span class="string">v1</span></span><br><span class="line"><span class="attr">kind:</span> <span class="string">Pod</span></span><br><span class="line"><span class="attr">metadata:</span></span><br><span class="line">  <span class="attr">name:</span> <span class="string">nginx</span></span><br><span class="line"><span class="attr">spec:</span></span><br><span class="line">  <span class="attr">nodeName:</span> <span class="string">foo-node</span> <span class="comment"># 调度 Pod 到特定的节点</span></span><br></pre></td></tr></table></figure><p>而对于整个对象 deployment 或者是 statefulset，那么答案是 <strong>亲和性</strong> 。比如官方就给出过对于 zk 的部署最佳实践中就提到，让 statefulset 的 pod 分布到不同的节点，以保证更好的高可用，不会因为所有 pod 都在一个节点，而这个节点挂了就一起挂了的情况。如下：<a href="https://kubernetes.io/zh-cn/docs/tutorials/stateful-application/zookeeper/#tolerating-node-failure">https://kubernetes.io/zh-cn/docs/tutorials/stateful-application/zookeeper/#tolerating-node-failure</a></p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">affinity:</span></span><br><span class="line">  <span class="attr">podAntiAffinity:</span></span><br><span class="line">    <span class="attr">requiredDuringSchedulingIgnoredDuringExecution:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="attr">labelSelector:</span></span><br><span class="line">          <span class="attr">matchExpressions:</span></span><br><span class="line">            <span class="bullet">-</span> <span class="attr">key:</span> <span class="string">&quot;app&quot;</span></span><br><span class="line">              <span class="attr">operator:</span> <span class="string">In</span></span><br><span class="line">              <span class="attr">values:</span></span><br><span class="line">                <span class="bullet">-</span> <span class="string">zk</span></span><br><span class="line">        <span class="attr">topologyKey:</span> <span class="string">&quot;kubernetes.io/hostname&quot;</span></span><br></pre></td></tr></table></figure><p>那么 DaemonSet 很大程度上会参考这样的规则，让调度器能把 pod 按照我们的要求每个节点调度一个。</p><p>于是乎，我们可以在源码中寻找来印证我们的想法：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/daemon/daemon_controller.go:993</span></span><br><span class="line"><span class="comment">// syncNodes deletes given pods and creates new daemon set pods on the given nodes</span></span><br><span class="line"><span class="comment">// returns slice with errors if any</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dsc *DaemonSetsController)</span></span> syncNodes(ctx context.Context, ds *apps.DaemonSet, podsToDelete, nodesNeedingDaemonPods []<span class="type">string</span>, hash <span class="type">string</span>) <span class="type">error</span> &#123;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">batchSize := integer.IntMin(createDiff, controller.SlowStartInitialBatchSize)</span><br><span class="line"><span class="keyword">for</span> pos := <span class="number">0</span>; createDiff &gt; pos; batchSize, pos = integer.IntMin(<span class="number">2</span>*batchSize, createDiff-(pos+batchSize)), pos+batchSize &#123;</span><br><span class="line">errorCount := <span class="built_in">len</span>(errCh)</span><br><span class="line">createWait.Add(batchSize)</span><br><span class="line"><span class="keyword">for</span> i := pos; i &lt; pos+batchSize; i++ &#123;</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">(ix <span class="type">int</span>)</span></span> &#123;</span><br><span class="line"><span class="keyword">defer</span> createWait.Done()</span><br><span class="line"></span><br><span class="line">podTemplate := template.DeepCopy()</span><br><span class="line"><span class="comment">// The pod&#x27;s NodeAffinity will be updated to make sure the Pod is bound</span></span><br><span class="line"><span class="comment">// to the target node by default scheduler. It is safe to do so because there</span></span><br><span class="line"><span class="comment">// should be no conflicting node affinity with the target node.</span></span><br><span class="line">podTemplate.Spec.Affinity = util.ReplaceDaemonSetPodNodeNameNodeAffinity(</span><br><span class="line">podTemplate.Spec.Affinity, nodesNeedingDaemonPods[ix])</span><br><span class="line"></span><br><span class="line">err := dsc.podControl.CreatePods(ctx, ds.Namespace, podTemplate,</span><br><span class="line">ds, metav1.NewControllerRef(ds, controllerKind))</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">&#125;(i)</span><br><span class="line">&#125;</span><br><span class="line">createWait.Wait()</span><br><span class="line"><span class="comment">// any skipped pods that we never attempted to start shouldn&#x27;t be expected.</span></span><br><span class="line">skippedPods := createDiff - (batchSize + pos)</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>我们可以看到在 <code>syncNodes</code> 方法中 <code>dsc.podControl.CreatePods</code> 之前，除了将原有的所有 template 属性 <code>DeepCopy</code> 了一份之外，单独处理了 <code>Affinity</code> （亲和性）并且处理的条件是什么呢？也就是 <code>ReplaceDaemonSetPodNodeNameNodeAffinity</code> 的第二个参数</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">ReplaceDaemonSetPodNodeNameNodeAffinity</span><span class="params">(affinity *v1.Affinity, nodename <span class="type">string</span>)</span></span> *v1.Affinity &#123;</span><br></pre></td></tr></table></figure><p><code>nodename</code> 破案了~ 所以，其实 DaemonSet 就是靠着来实现的，其他都是浮云，本质其实挺简单的。其实复杂的部分都给调度器了。</p><h3 id="问题-3-1"><a href="#问题-3-1" class="headerlink" title="问题 3"></a>问题 3</h3><blockquote><p>为了保证 pod 的关系和数量，我会猜测 DaemonSet 可能需要存 node 和 pod 的对应关系，如果有，是存在了哪里？</p></blockquote><p>这是一个很容易被疑惑和误导的问题，其实有了问题 2 做铺垫，这个问题也就能瞥见一点了。如果没有看过源码，你或许就可能会想，DaemonSet 应该存储了节点和 pod 的对应关系，方便在选择的时候选择合适的节点，并且当新来的时候可以确认当前没有 pod 的节点是哪一个。而事实并不是这样。DaemonSet 并不会保存这样的对应关系。有一个显然的理由是，在问题 2 中我们已经看到，pod 的调度完全是依靠调度器去完成的，控制器仅仅只是描述信息罢了，最终 pod 会调度到哪里其实并不归他管。</p><p>但是，DaemonSet 也必须要知道这个对应关系，没有这个关系，无论是后续更新还是本身的状态变化都需要依赖这个部分。于是乎，我们可以在 <code>rollingUpdate</code> 的时候发现它是如何操作的。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/daemon/update.go:42</span></span><br><span class="line"><span class="comment">// rollingUpdate identifies the set of old pods to delete, or additional pods to create on nodes,</span></span><br><span class="line"><span class="comment">// remaining within the constraints imposed by the update strategy.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dsc *DaemonSetsController)</span></span> rollingUpdate(ctx context.Context, ds *apps.DaemonSet, nodeList []*v1.Node, hash <span class="type">string</span>) <span class="type">error</span> &#123;</span><br><span class="line">logger := klog.FromContext(ctx)</span><br><span class="line">nodeToDaemonPods, err := dsc.getNodesToDaemonPods(ctx, ds, <span class="literal">false</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;couldn&#x27;t get node to daemon pod mapping for daemon set %q: %v&quot;</span>, ds.Name, err)</span><br><span class="line">&#125;</span><br><span class="line">maxSurge, maxUnavailable, desiredNumberScheduled, err := dsc.updatedDesiredNodeCounts(ctx, ds, nodeList, nodeToDaemonPods)</span><br></pre></td></tr></table></figure><p><code>rollingUpdate</code> 方法显然是用于执行 DaemonSet 滚动更新的时候用的，也就是 pod 不断更新的过程。而这个方法本身是用来计算出需要更新哪些 pod ，哪一些要删，哪一些要新增。具体就不再展开。关键是这个部分</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">nodeToDaemonPods, err := dsc.getNodesToDaemonPods(ctx, ds, <span class="literal">false</span>)</span><br></pre></td></tr></table></figure><p><code>getNodesToDaemonPods</code> 返回了一个 map，<code>nodeToDaemonPods</code>，key 是 <code>NodeName</code> 而 value 则是对应的 pod 列表。内部的实现其实也非常简单。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/daemon/daemon_controller.go:755</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dsc *DaemonSetsController)</span></span> getNodesToDaemonPods(ctx context.Context, ds *apps.DaemonSet, includeDeletedTerminal <span class="type">bool</span>) (<span class="keyword">map</span>[<span class="type">string</span>][]*v1.Pod, <span class="type">error</span>) &#123;</span><br><span class="line">claimedPods, err := dsc.getDaemonPods(ctx, ds)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// Group Pods by Node name.</span></span><br><span class="line">nodeToDaemonPods := <span class="built_in">make</span>(<span class="keyword">map</span>[<span class="type">string</span>][]*v1.Pod)</span><br><span class="line">logger := klog.FromContext(ctx)</span><br><span class="line"><span class="keyword">for</span> _, pod := <span class="keyword">range</span> claimedPods &#123;</span><br><span class="line"><span class="keyword">if</span> !includeDeletedTerminal &amp;&amp; podutil.IsPodTerminal(pod) &amp;&amp; pod.DeletionTimestamp != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="comment">// This Pod has a finalizer or is already scheduled for deletion from the</span></span><br><span class="line"><span class="comment">// store by the kubelet or the Pod GC. The DS controller doesn&#x27;t have</span></span><br><span class="line"><span class="comment">// anything else to do with it.</span></span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line">nodeName, err := util.GetTargetNodeName(pod)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">logger.V(<span class="number">4</span>).Info(<span class="string">&quot;Failed to get target node name of Pod in DaemonSet&quot;</span>,</span><br><span class="line"><span class="string">&quot;pod&quot;</span>, klog.KObj(pod), <span class="string">&quot;daemonset&quot;</span>, klog.KObj(ds))</span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">nodeToDaemonPods[nodeName] = <span class="built_in">append</span>(nodeToDaemonPods[nodeName], pod)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> nodeToDaemonPods, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>可以看到就是将 DaemonPods 拿出来，通过 <code>GetTargetNodeName</code> 拿到对应的 nodeName 然后分好就可以了。其中内部就是通过 <code>dsc.podLister.Pods(ds.Namespace).List(labels.Everything())</code> 来完成的。总结一下，就是其实当时直接查出来的。</p><p>看到这里你也许会好奇为什么我会单独把这个部分拿出来看，而不是去看其他创建或者计算的过程。首先我会觉得其他的部分可以算是 “业务” 它有着自己的逻辑，按部就班，并且正确计算条件即可。而之所以看这部分是想强化一下我们对于控制循环的理解，我们在这个大章节最开始就提到了它。控制循环的本质是根据当前状态和期望状态不一致，从而触发改变，让目标状态最终能变成期望状态，而关键在于是 ”当前状态“，这个状态可能会由于整个集群任何操作变化的改变而变动，所以只有当下去看，才能知道目前的状态是什么样的，改变的因素太多了。</p><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><p>这一节我们看了 DaemonPods 的源码部分，如果你已经可以自己在源码中寻找到前面提出问题的答案，那么我相信对于各种其他的对象你也可以轻车熟路了。并且看到这里，你应该就能感觉到，其实看源码本身并不难，找准目标一步步往下走就可以了，虽然代码量很多，但是设计绝大多数其实都是相通的，一个类型看一个，都能举一反三。相信你渐渐能有这样的体会。</p><h3 id="编码上"><a href="#编码上" class="headerlink" title="编码上"></a>编码上</h3><p>最后，在编码上，我们可以总结一个小点。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dsc *DaemonSetsController)</span></span> syncNodes(ctx context.Context, ds *apps.DaemonSet, podsToDelete, nodesNeedingDaemonPods []<span class="type">string</span>, hash <span class="type">string</span>) <span class="type">error</span> &#123;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line">errCh := <span class="built_in">make</span>(<span class="keyword">chan</span> <span class="type">error</span>, createDiff+deleteDiff)</span><br><span class="line">createWait := sync.WaitGroup&#123;&#125;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">batchSize := integer.IntMin(createDiff, controller.SlowStartInitialBatchSize)</span><br><span class="line"><span class="keyword">for</span> pos := <span class="number">0</span>; createDiff &gt; pos; batchSize, pos = integer.IntMin(<span class="number">2</span>*batchSize, createDiff-(pos+batchSize)), pos+batchSize &#123;</span><br><span class="line">errorCount := <span class="built_in">len</span>(errCh)</span><br><span class="line">createWait.Add(batchSize)</span><br><span class="line"><span class="keyword">for</span> i := pos; i &lt; pos+batchSize; i++ &#123;</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">(ix <span class="type">int</span>)</span></span> &#123;</span><br><span class="line"><span class="keyword">defer</span> createWait.Done()</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">errCh &lt;- err</span><br><span class="line">&#125;</span><br><span class="line">&#125;(i)</span><br><span class="line">&#125;</span><br><span class="line">createWait.Wait()</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">errors := []<span class="type">error</span>&#123;&#125;</span><br><span class="line"><span class="built_in">close</span>(errCh)</span><br><span class="line"><span class="keyword">for</span> err := <span class="keyword">range</span> errCh &#123;</span><br><span class="line">errors = <span class="built_in">append</span>(errors, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> utilerrors.NewAggregate(errors)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>在我们前面看到的 <code>syncNodes</code> 方法中有一个非常标准的利用 WaitGroup 去并发处理任务并等待任务处理完毕，同时利用 <code>chan error</code> 将错误统一发送到 channel 最后一并处理合并的最佳实践。这一部分的编码我相信很多地方都是可以使用的，希望你也能学到。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>OrbStack 配置国内镜像加速</title>
    <link href="https://www.linkinstars.com/post/c1aaa2.html"/>
    <id>https://www.linkinstars.com/post/c1aaa2.html</id>
    <published>2024-08-31T16:00:00.000Z</published>
    <updated>2024-09-06T04:20:36.490Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/c1aaa2.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>使用 docker 我们知道可以通过配置 <code>/etc/docker/daemon.json</code> 的 <code>registry-mirrors</code> 来实现，但是由于 docker 桌面版本在 macOS 上实在不好玩，还是 orbstack 好用，占用资源更少。那么使用 OrbStack 如何配置国内镜像加速器呢？其实很简单。</p><h2 id="配置"><a href="#配置" class="headerlink" title="配置"></a>配置</h2><p>创建并编辑 <code>~/.orbstack/config/docker.json</code></p><figure class="highlight json"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="punctuation">&#123;</span></span><br><span class="line">    <span class="attr">&quot;registry-mirrors&quot;</span><span class="punctuation">:</span> <span class="punctuation">[</span></span><br><span class="line">        <span class="string">&quot;https://dockerproxy.com&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="string">&quot;https://docker.mirrors.ustc.edu.cn&quot;</span><span class="punctuation">,</span></span><br><span class="line">        <span class="string">&quot;https://docker.nju.edu.cn&quot;</span></span><br><span class="line">    <span class="punctuation">]</span></span><br><span class="line"><span class="punctuation">&#125;</span></span><br></pre></td></tr></table></figure><p>目前可用的镜像：<a href="https://gist.github.com/y0ngb1n/7e8f16af3242c7815e7ca2f0833d3ea6">https://gist.github.com/y0ngb1n/7e8f16af3242c7815e7ca2f0833d3ea6</a></p><h2 id="验证"><a href="#验证" class="headerlink" title="验证"></a>验证</h2><p>重启 orbstack，之后命令行执行 <code>docker info</code>，如果可以看到了 Registry Mirrors 里面有你的配置就对了</p><figure class="highlight shell"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">Registry Mirrors:</span><br><span class="line">  https://dockerproxy.com</span><br></pre></td></tr></table></figure>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;使用 docker 我们知道可以通过配置 &lt;code&gt;/etc/docker/daemon.json&lt;/code&gt; 的</summary>
        
      
    
    
    
    <category term="macos-hint" scheme="https://www.linkinstars.com/categories/macos-hint/"/>
    
    
    <category term="OrbStack" scheme="https://www.linkinstars.com/tags/OrbStack/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》statefulset 的更新有何不同</title>
    <link href="https://www.linkinstars.com/post/2c457b03.html"/>
    <id>https://www.linkinstars.com/post/2c457b03.html</id>
    <published>2024-08-30T16:00:00.000Z</published>
    <updated>2024-08-29T10:53:24.706Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/2c457b03.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>在前面我们已经看过了 deployment 和 replicaset 的实现，其实对于 k8s 中的对象已经有了一个基本的认识，其他的对象也都是在这个的基础之上有了不同的能力。而这一节我们来看看另一个常用的对象 statefulset。相对与 deployment 来说 statefulset 用的会更少，因为大部分应用都是无状态的，而有状态的数据类型的应用可能上 k8s 又少，要不就是接云厂商，要不就是独立部署。但是对于一些需要持久化配置或者数据的应用来说，配合 StorageClass 能让 StatefulSet 很好的帮助我们来部署这样类型的应用。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>statefulset 的基本使用</li><li>statefulset 的更新过程</li><li>statefulset 的 partition 的作用</li></ul><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>我们知道滚动更新的时候 statefulset 是一个一个的这里的实现与 deployment 有什么不一样的地方呢？这部分是今天的主角我们需要弄明白。而另一部分是有关于 <code>persistentVolumeClaimRetentionPolicy</code> 这个是 <code>v1.27</code> beta 的特性，用于控制是否删除以及如何删除 PVC，除了看原本的源码，这次我希望给你看一些不一样的，比如在 k8s 里面，对于新特性是如何引入和做判断的。</p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><ol><li>statefulset 滚动更新的实现与 deployment 有什么区别？</li><li>statefulset <code>persistentVolumeClaimRetentionPolicy</code> 是如何实现的</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><h3 id="寻码过程"><a href="#寻码过程" class="headerlink" title="寻码过程"></a>寻码过程</h3><p>这次我就不多说了，有了前面的经验，找到它易如反掌</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">kubernetes/pkg/controller/deployment</span><br><span class="line">kubernetes/pkg/controller/replicaset</span><br><span class="line">kubernetes/pkg/controller/statefulset</span><br></pre></td></tr></table></figure><p>而这一次提供另一种看源码的思路，类比。由于我们已经比较了解 deployment 的整体实现了，所以大部分相同的地方我们可以直接跳过，我们主要去寻找不一样的地方。</p><h3 id="结构-和-创建"><a href="#结构-和-创建" class="headerlink" title="结构 和 创建"></a>结构 和 创建</h3><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/statefulset/stateful_set.go:83</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewStatefulSetController</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">ctx context.Context,</span></span></span><br><span class="line"><span class="params"><span class="function">podInformer coreinformers.PodInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">setInformer appsinformers.StatefulSetInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">pvcInformer coreinformers.PersistentVolumeClaimInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">revInformer appsinformers.ControllerRevisionInformer,</span></span></span><br><span class="line"><span class="params"><span class="function">kubeClient clientset.Interface,</span></span></span><br><span class="line"><span class="params"><span class="function">)</span></span> *StatefulSetController &#123;</span><br></pre></td></tr></table></figure><p>具体结构和创建方法其实都不用贴，这里的入参就很能说明问题了。类比一下</p><ol><li>没有 <code>ReplicaSetInformer</code> 证明 pod 已经不是通过 <code>RS</code> 去控制了，而是直接给到了 <code>podInformer</code> 然后交给 <code>StatefulSetController</code> 的 <code>updatePod</code> 相关方法</li><li>多了 <code>PersistentVolumeClaimInformer</code> 和 <code>ControllerRevisionInformer</code> 由于我们知道 PVC 是什么东西，由于有状态，大多数情况下会用到 PV 和 PVC 所以启动的时候势必需要等待他们完成</li></ol><p>那接下来我们的思路就很明确了，我们需要去看 pod 更新的时候具体是如何操作的</p><h3 id="更新"><a href="#更新" class="headerlink" title="更新"></a>更新</h3><p>之前我们的路径还有印象对吧：<code>Run</code> -&gt; <code>worker</code> -&gt; <code>processNextWorkItem</code> -&gt; <code>syncHandler</code> 。</p><p>类比着我们很快能找到在 statefulset 里面也是类似的：<code>Run</code> -&gt; <code>worker</code> -&gt; <code>processNextWorkItem</code> -&gt; <code>sync</code> -&gt; <code>syncStatefulSet</code> -&gt; <code>UpdateStatefulSet</code> -&gt; <code>performUpdate</code> -&gt; <code>updateStatefulSet</code>。</p><p>而 <code>updateStatefulSet</code> 方法就是更新的关键了。</p><h4 id="策略-UpdateStrategy"><a href="#策略-UpdateStrategy" class="headerlink" title="策略 UpdateStrategy"></a>策略 UpdateStrategy</h4><p>RollingUpdateStatefulSetStrategyType 是 statefulset 的更新策略，就两种，非常简单</p><ul><li><code>RollingUpdate</code>，默认就是这个，滚动更新，一个好了接一个</li><li><code>OnDelete</code>，很简单，就是需要用户手动去删除才会更新</li></ul><p>由于 <code>OnDelete</code> 很少用到，所以可能被忽略。不过为什么要先知道策略呢？这里就要可以利用另外一个源码的阅读技巧了，合理利用枚举参数。</p><blockquote><p>利用固定的枚举参数，可以快速缩小源码的阅读内容，也可以快速定位目标</p></blockquote><h4 id="滚动更新"><a href="#滚动更新" class="headerlink" title="滚动更新"></a>滚动更新</h4><p>我们知道 <code>OnDelete</code> 是用户手动操作才会更新 <code>pod</code> ，<strong>那么源码里面必定需要判断这个状态，如果不是这个状态才会去操作 pod 主动去删除。</strong> 所以我们直接定位到 <code>updateStatefulSet</code> 的最后：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/statefulset/stateful_set_control.go:658</span></span><br><span class="line"><span class="comment">// for the OnDelete strategy we short circuit. Pods will be updated when they are manually deleted.</span></span><br><span class="line"><span class="keyword">if</span> set.Spec.UpdateStrategy.Type == apps.OnDeleteStatefulSetStrategyType &#123;</span><br><span class="line"><span class="keyword">return</span> &amp;status, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// we compute the minimum ordinal of the target sequence for a destructive update based on the strategy.</span></span><br><span class="line">updateMin := <span class="number">0</span></span><br><span class="line"><span class="keyword">if</span> set.Spec.UpdateStrategy.RollingUpdate != <span class="literal">nil</span> &#123;</span><br><span class="line">updateMin = <span class="type">int</span>(*set.Spec.UpdateStrategy.RollingUpdate.Partition)</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// we terminate the Pod with the largest ordinal that does not match the update revision.</span></span><br><span class="line"><span class="keyword">for</span> target := <span class="built_in">len</span>(replicas) - <span class="number">1</span>; target &gt;= updateMin; target-- &#123;</span><br><span class="line"></span><br><span class="line"><span class="comment">// delete the Pod if it is not already terminating and does not match the update revision.</span></span><br><span class="line"><span class="keyword">if</span> getPodRevision(replicas[target]) != updateRevision.Name &amp;&amp; !isTerminating(replicas[target]) &#123;</span><br><span class="line">logger.V(<span class="number">2</span>).Info(<span class="string">&quot;Pod of StatefulSet is terminating for update&quot;</span>,</span><br><span class="line"><span class="string">&quot;statefulSet&quot;</span>, klog.KObj(set), <span class="string">&quot;pod&quot;</span>, klog.KObj(replicas[target]))</span><br><span class="line"><span class="keyword">if</span> err := ssc.podControl.DeleteStatefulPod(set, replicas[target]); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">if</span> !errors.IsNotFound(err) &#123;</span><br><span class="line"><span class="keyword">return</span> &amp;status, err</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">status.CurrentReplicas--</span><br><span class="line"><span class="keyword">return</span> &amp;status, err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// wait for unhealthy Pods on update</span></span><br><span class="line"><span class="keyword">if</span> !isHealthy(replicas[target]) &#123;</span><br><span class="line">logger.V(<span class="number">4</span>).Info(<span class="string">&quot;StatefulSet is waiting for Pod to update&quot;</span>,</span><br><span class="line"><span class="string">&quot;statefulSet&quot;</span>, klog.KObj(set), <span class="string">&quot;pod&quot;</span>, klog.KObj(replicas[target]))</span><br><span class="line"><span class="keyword">return</span> &amp;status, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> &amp;status, <span class="literal">nil</span></span><br></pre></td></tr></table></figure><p>可以明确看到，如果是 <code>OnDelete</code> 状态那就直接返回了，那也就是说下面就是操作 pod 了。果然，下面的逻辑其实非常简单，其中有三个注意点：</p><ol><li>将最大的那个 pod 控制去 <code>DeleteStatefulPod</code> 然后直接返回了</li><li><code>if !isHealthy(replicas[target]) &#123;</code> 也就是：<strong>当一个 pod 正在更新的时候，也会直接返回</strong>。也就是 statefulset 必然是一个好了再下一个更新的</li><li><code>Partition</code> 是用来做 金丝雀 发布的，你应该有所了解，也是在这里处理的，只有序号 ≥ partition 的才会更新</li></ol><p>其实更新部分我觉得最重要的部分这样就被解决了，我们通过枚举的技巧可以快速将一个 200+ 行的函数快速定位到了自己所需要的部分，当然如果你对前面对于 pod 排序创建等操作，有想要了解的回过头去看另一半就可以了，此时你就可以完全不管删除的逻辑了，上半部分肯定是在处理删除之前的逻辑，那么你的方向会更清晰。</p><h3 id="persistentVolumeClaimRetentionPolicy"><a href="#persistentVolumeClaimRetentionPolicy" class="headerlink" title="persistentVolumeClaimRetentionPolicy"></a>persistentVolumeClaimRetentionPolicy</h3><p>在以前，statefulset 被删除之后 PVC 通常是不受影响的，也就是 <code>Retain</code>，而还可以配置 <code>Delete</code> 也就是删除。并且 <code>persistentVolumeClaimRetentionPolicy</code> 可以支持 <code>whenDeleted</code> 和 <code>whenScaled</code> 就是在不同场景下支持不同的控制策略。比如当 pod 被删除时是 PVC 是保留的，但 缩减(scaled) 的时候删除。</p><p>这里特性本身不是特别重要，重要的是，我想让你看下对于新特性的引入，在 k8s 中是如何做判断的。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/statefulset/stateful_set_control.go:387</span></span><br><span class="line"><span class="comment">// If we find a Pod that has not been created we create the Pod</span></span><br><span class="line"><span class="keyword">if</span> !isCreated(replicas[i]) &#123;</span><br><span class="line"><span class="keyword">if</span> utilfeature.DefaultFeatureGate.Enabled(features.StatefulSetAutoDeletePVC) &#123;</span><br><span class="line"><span class="keyword">if</span> isStale, err := ssc.podControl.PodClaimIsStale(set, replicas[i]); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span>, err</span><br><span class="line">&#125; <span class="keyword">else</span> <span class="keyword">if</span> isStale &#123;</span><br><span class="line"><span class="comment">// If a pod has a stale PVC, no more work can be done this round.</span></span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span>, err</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> err := ssc.podControl.CreateStatefulPod(ctx, set, replicas[i]); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span>, err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> monotonic &#123;</span><br><span class="line"><span class="comment">// if the set does not allow bursting, return immediately</span></span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>上面这一部分是在 <code>updateStatefulSet</code> 的 <code>processReplica</code> 根据 pod 不同状态执行不同操作，其中我们可以看到，其实并不复杂，就是通过了 <code>utilfeature.DefaultFeatureGate</code> 的 <code>Enabled</code> 方法来得到当前所需要的这个 feature 是否被开启来，如果开启了，就可以执行下方的判断。而 <code>utilfeature.DefaultFeatureGate</code> 本质也就是一个 map ，存储了所有的 feat，而 features 枚举了所以的特性，其中有非常详细的版本注释。</p><p>其实和一般处理的方式没啥区别，如果让我们来写也是一样的，就是通过全局变量来注册所有特性的状态而已。而判断的时候也就是判断一下里面开没开。当然这可能确实有点 ”散弹修改“ 的味道，但是由于特性的目标不会多，所以面积不会广，全局也随时能用，无耦合，可以学习。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/statefulset/stateful_pod_control.go:265</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(spc *StatefulPodControl)</span></span> PodClaimIsStale(set *apps.StatefulSet, pod *v1.Pod) (<span class="type">bool</span>, <span class="type">error</span>) &#123;</span><br><span class="line">policy := getPersistentVolumeClaimRetentionPolicy(set)</span><br><span class="line"><span class="keyword">if</span> policy.WhenScaled == apps.RetainPersistentVolumeClaimRetentionPolicyType &#123;</span><br><span class="line"><span class="comment">// PVCs are meant to be reused and so can&#x27;t be stale.</span></span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">for</span> _, claim := <span class="keyword">range</span> getPersistentVolumeClaims(set, pod) &#123;</span><br><span class="line">pvc, err := spc.objectMgr.GetClaim(claim.Namespace, claim.Name)</span><br><span class="line"><span class="keyword">switch</span> &#123;</span><br><span class="line"><span class="keyword">case</span> apierrors.IsNotFound(err):</span><br><span class="line"><span class="comment">// If the claim doesn&#x27;t exist yet, it can&#x27;t be stale.</span></span><br><span class="line"><span class="keyword">continue</span></span><br><span class="line"><span class="keyword">case</span> err != <span class="literal">nil</span>:</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span>, err</span><br><span class="line"><span class="keyword">case</span> err == <span class="literal">nil</span>:</span><br><span class="line"><span class="keyword">if</span> hasStaleOwnerRef(pvc, pod, podKind) &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>当然后面其实就是在 <code>PodClaimIsStale</code> 中判断 是 <code>Retain</code> 还是 <code>Delete</code> 了。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li>statefulset 滚动更新的实现与 deployment 有什么区别？<ol><li>关键在于顺序(有序)和个数(一次一个)</li></ol></li><li>statefulset <code>persistentVolumeClaimRetentionPolicy</code> 是如何实现的？<ol><li>很简单，通过 <code>utilfeature.DefaultFeatureGate</code> 一个全局变量来进行判断</li></ol></li></ol><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><p>可以看到，由于我们之前有了其他类似的源码经验，其实对于整体过程已经有了把握，很多地方就没必要再去仔仔细细一步步推敲了，因为实现都是类似的，我们只需要抓住不同，类比即可。找到不同的地方，看自己关心的地方，就能快速知道源码里面做的事情是什么。<strong>只要从大方向有了把握，之后有问题你就可以迅速定位到这个问题可能出现的原因，以及有寻找的思路了。</strong></p><h3 id="编码上"><a href="#编码上" class="headerlink" title="编码上"></a>编码上</h3><p>对于项目内新特性的引入完全可以参考 <code>utilfeature.DefaultFeatureGate</code> 的设计，在引入使用 beta 一段时间，在后续的正式版本中上线。一个 map + 一个 if 的事。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>MacOS 下如何安装 gnu 版本的 sed</title>
    <link href="https://www.linkinstars.com/post/51c5a57b.html"/>
    <id>https://www.linkinstars.com/post/51c5a57b.html</id>
    <published>2024-08-19T16:00:00.000Z</published>
    <updated>2024-08-29T10:52:38.684Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/51c5a57b.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>在 MacOS 下使用 sed -i 就会出现类似下面的错误</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">sed: 1: <span class="string">&quot;...&quot;</span>: <span class="built_in">command</span> c expects \ followed by text</span><br><span class="line">sed: 1: <span class="string">&quot;...&quot;</span>: <span class="built_in">command</span> i expects \ followed by text</span><br></pre></td></tr></table></figure><p>原因是由于 MacOS 下默认的 sed 和 Linux 下是不一样的，导致 <code>-i</code> 无法正确识别，通常的做法是安装一个 <code>gnu-sed</code> 已替换原本的 <code>sed</code> 命令</p><h2 id="安装步骤"><a href="#安装步骤" class="headerlink" title="安装步骤"></a>安装步骤</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 查询一下</span></span><br><span class="line"><span class="built_in">which</span> sed</span><br><span class="line"></span><br><span class="line"><span class="comment"># 安装</span></span><br><span class="line">brew install gnu-sed</span><br><span class="line"></span><br><span class="line"><span class="comment"># 查看</span></span><br><span class="line">brew info gnu-sed</span><br><span class="line"></span><br><span class="line">==&gt; gnu-sed: stable 4.9 (bottled)</span><br><span class="line">GNU implementation of the famous stream editor</span><br><span class="line">https://www.gnu.org/software/sed/</span><br><span class="line">Conflicts with:</span><br><span class="line">  ssed (because both install share/info/sed.info)</span><br><span class="line">Installed</span><br><span class="line">/opt/homebrew/Cellar/gnu-sed/4.9 (13 files, 616.5KB) *</span><br><span class="line">  Poured from bottle using the formulae.brew.sh API on 2024-08-23 at 16:14:00</span><br><span class="line">From: https://github.com/Homebrew/homebrew-core/blob/HEAD/Formula/g/gnu-sed.rb</span><br><span class="line">License: GPL-3.0-or-later</span><br><span class="line">==&gt; Caveats</span><br><span class="line">GNU <span class="string">&quot;sed&quot;</span> has been installed as <span class="string">&quot;gsed&quot;</span>.</span><br><span class="line">If you need to use it as <span class="string">&quot;sed&quot;</span>, you can add a <span class="string">&quot;gnubin&quot;</span> directory</span><br><span class="line">to your PATH from your bashrc like:</span><br><span class="line"></span><br><span class="line">    PATH=<span class="string">&quot;/opt/homebrew/opt/gnu-sed/libexec/gnubin:<span class="variable">$PATH</span>&quot;</span></span><br><span class="line"></span><br><span class="line"></span><br></pre></td></tr></table></figure><h2 id="配置环境变量"><a href="#配置环境变量" class="headerlink" title="配置环境变量"></a>配置环境变量</h2><p>将 <code>PATH=&quot;/opt/homebrew/opt/gnu-sed/libexec/gnubin:$PATH&quot;</code> 部分配置到 <code>~/.zshrc</code> 里面就可以了，别忘记 <code>source</code> 让它生效哦。</p><h2 id="验证"><a href="#验证" class="headerlink" title="验证"></a>验证</h2><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment"># 记得验证一下</span></span><br><span class="line"><span class="built_in">which</span> sed</span><br><span class="line"></span><br><span class="line"><span class="comment"># 当然你也可以直接使用 `gsed`</span></span><br></pre></td></tr></table></figure>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;在 MacOS 下使用 sed -i 就会出现类似下面的错误&lt;/p&gt;
&lt;figure class=&quot;highlight</summary>
        
      
    
    
    
    <category term="macos-hint" scheme="https://www.linkinstars.com/categories/macos-hint/"/>
    
    
    <category term="sed" scheme="https://www.linkinstars.com/tags/sed/"/>
    
  </entry>
  
  <entry>
    <title>浅析 Rust 所有权设计</title>
    <link href="https://www.linkinstars.com/post/d17fc0bb.html"/>
    <id>https://www.linkinstars.com/post/d17fc0bb.html</id>
    <published>2024-08-14T16:00:00.000Z</published>
    <updated>2024-08-22T09:20:39.845Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/d17fc0bb.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>市面上的 Rust 书已经把基础说的足够好了，我没必要再搬砖，而是尝试从一些个人的角度来重新审视其中的一些设计。Rust 我认为最为关键的设计：所有权和借用。我们都知道 Rust 和 Go、Java 相比最大的不同就是没有 GC，那么没有 GC 的设计<strong>需要付出一定的代价</strong>，这个代价一部分就体现在今天要说的这两个设计中。</p><h2 id="代价-1-所有权"><a href="#代价-1-所有权" class="headerlink" title="代价 1: 所有权"></a>代价 1: 所有权</h2><h3 id="三规"><a href="#三规" class="headerlink" title="三规"></a>三规</h3><blockquote><ol><li>每一个值都被一个变量所拥有 Each value in Rust has a variable that’s called its owner</li><li>一个值同时只能被一个变量所拥有 There can only be one owner at a time</li><li>当所有者离开作用域时，这个值将被丢弃 When the owner goes out of scope, the value will be dropped</li></ol></blockquote><p>很多基础都会提到这三个规定，非常重要。理解它就能理解所有权。而反过来，当我看完之后，反过来，其实可以从中推导它的设计。</p><h3 id="问答推导"><a href="#问答推导" class="headerlink" title="问答推导"></a>问答推导</h3><blockquote><p>我还是习惯称呼为对象，所以我下面以对象称呼。</p></blockquote><p>提问：GC 的目标是什么？<br>回答：垃圾对象，无用的对象</p><p>提问：垃圾怎么产生的？<br>回答：对象被创建了，但是后来它没用了</p><p>提问：怎么知道它没用了？<br>回答：我怎么知道？！😡 不就是没人用了么？</p><p>很好，其实到此，我们就已经知道了要解决的问题是什么了。接下来让我们一步步从后往前推导：</p><ol><li>我不想要 GC，那我首先必须能有办法丢弃掉不用的垃圾对象，我觉得 “作用域” 是一个很不错的分界点，出了作用域我就把对象给扔掉 (规则 3)</li><li>当一个对象有多个用户(所有者&#x2F;变量) 如：a1 &#x3D; s; a2 &#x3D; s; 那么此时离开作用域时，我会同时想要释放两次，并且这两次还是同一个对象，这不合理，我得限制让一个值只能有一个变量 (规则 12)</li></ol><h3 id="结论"><a href="#结论" class="headerlink" title="结论"></a>结论</h3><p>这样推导下来其实 Rust，也没有特别复杂，它只不过将 GC 的回收工作放到了作用域结束，强制在离开时回收掉。</p><p>而原本 GC 要确定一个对象是否已经为垃圾，常用的方式是引用计数对吧？而 Rust 直接强制只有一个所有者，你没了就是你没了，和别人无关。直接简化了整个逻辑。</p><h2 id="代价-2-借用"><a href="#代价-2-借用" class="headerlink" title="代价 2: 借用"></a>代价 2: 借用</h2><p>由于所有权的<strong>唯一性</strong>，那我们在传递的时候势必就需要做一个操作，那就是<strong>转移</strong>。一直转来转去很麻烦，有没有别的办法呢？第一想法其实就是传指针，也就是搞引用。所以，Rust 弄了一个借用的功能出来。（注意在 Rust 中，“借用”和“引用”是一个概念）</p><h3 id="借用不是就有多个人在用了吗？"><a href="#借用不是就有多个人在用了吗？" class="headerlink" title="借用不是就有多个人在用了吗？"></a>借用不是就有多个人在用了吗？</h3><p>这是我们遇到的第一个疑问，如果一个对象有多个引用，这不是又回到老路上，并且和所有权的规则向违背了吗？非也非也。关键点来了：</p><blockquote><p>所有权，所有权，是证明东西是我的；借用，借用，东西还是你的，我借一下而已，还会还给你的。</p></blockquote><p>这也是为什么称为借用(Borrow) 的原因，<strong>其实所有权本身并没有发生转移</strong>。</p><h3 id="借用得有规则"><a href="#借用得有规则" class="headerlink" title="借用得有规则"></a>借用得有规则</h3><p>借用也不能无条件的乱用，要有规则，而它的规则，我们并不用死记。懂并发吗？读写锁应该有了解吧？没错和读写锁的逻辑其实是一致的。</p><ol><li>能有多个不可变的引用（能同时多读）</li><li>可变引用与不可变引用不能同时存在（不能同时读写）</li><li>同一作用域，特定数据只能有一个可变引用（只能有一个写）</li></ol><p>那这个解决了什么问题呢？并发问题？不是，应该说 数据竞争 问题。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>为了没有 GC，设计了所有权作为代价；为了解决单一所有权的转移问题，设计了借用规则作为代价。而借用规则的可变不可变，也正是 Rust 设计中 mut 的理念的一种体现吧。所以总的来说，整个设计是环环相扣的剧情，其实并不复杂，理解本质原因，其实就能理解其中的设计了。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;市面上的 Rust 书已经把基础说的足够好了，我没必要再搬砖，而是尝试从一些个人的角度来重新审视其中的一些设计。Rust</summary>
        
      
    
    
    
    <category term="rust" scheme="https://www.linkinstars.com/categories/rust/"/>
    
    
    <category term="rust" scheme="https://www.linkinstars.com/tags/rust/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》replicaset 到底有何不同</title>
    <link href="https://www.linkinstars.com/post/4b56e50d.html"/>
    <id>https://www.linkinstars.com/post/4b56e50d.html</id>
    <published>2024-07-31T16:00:00.000Z</published>
    <updated>2024-08-29T10:43:26.818Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/4b56e50d.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>前一节我们看到了 deployment 的滚动更新实现，如果你对它已经有一个比较清晰的认识，那么这一节的 replicaset 就非常容易理解了，因为基本实现都是差不多的。为了方便描述后面文中提及的 replicaset 统一简写为 RS。</p><p>在一开始学习 k8s 的时候其实我们不一定能碰到这个对象，如果只是日常的使用通常来说的都是 deployment 或者是 statefulset 这样。渐渐深入才会发现它。好像默默无闻的它是做什么的呢？</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>RS 是什么？</li></ul><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>在不知道 RS 之前我一直都以为是 deployment 直接去控制的 pod。而在一开始了解之后，我会好奇为什么要设计一个 RS，直接控制不行吗？渐渐的深入，就会发现，其实它有着自己的设计在里面。</p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><ol><li>RS 和 Deployment 关系是什么？</li><li>有何特别的设计？</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><h3 id="寻码过程"><a href="#寻码过程" class="headerlink" title="寻码过程"></a>寻码过程</h3><p>有了 deployment 的经验其实 RS 寻码的过程就非常简单了。关键都是在 <code>控制器</code> 上。于是在相同的包下面我们就容易找到它。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line">kubernetes/pkg/controller/deployment</span><br><span class="line">kubernetes/pkg/controller/replicaset</span><br></pre></td></tr></table></figure><p>而且我相信有了前面的经验，你已经对这样的对象看源码的过程比较容易上手了，我们会依旧先从：结构、如何 New、如何使用，使用过程中的细节，这几个部分来着手。</p><h3 id="结构"><a href="#结构" class="headerlink" title="结构"></a>结构</h3><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/replicaset/replica_set.go:81</span></span><br><span class="line"><span class="keyword">type</span> ReplicaSetController <span class="keyword">struct</span> &#123;</span><br><span class="line">kubeClient clientset.Interface</span><br><span class="line">podControl controller.PodControlInterface</span><br><span class="line">eventBroadcaster record.EventBroadcaster</span><br><span class="line"></span><br><span class="line"><span class="comment">// A ReplicaSet is temporarily suspended after creating/deleting these many replicas.</span></span><br><span class="line"><span class="comment">// It resumes normal action after observing the watch events for them.</span></span><br><span class="line">burstReplicas <span class="type">int</span></span><br><span class="line"><span class="comment">// To allow injection of syncReplicaSet for testing.</span></span><br><span class="line">syncHandler <span class="function"><span class="keyword">func</span><span class="params">(ctx context.Context, rsKey <span class="type">string</span>)</span></span> <span class="type">error</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// A TTLCache of pod creates/deletes each rc expects to see.</span></span><br><span class="line">expectations *controller.UIDTrackingControllerExpectations</span><br><span class="line"></span><br><span class="line"><span class="comment">// A store of ReplicaSets, populated by the shared informer passed to NewReplicaSetController</span></span><br><span class="line">rsLister appslisters.ReplicaSetLister</span><br><span class="line"><span class="comment">// rsListerSynced returns true if the pod store has been synced at least once.</span></span><br><span class="line"><span class="comment">// Added as a member to the struct to allow injection for testing.</span></span><br><span class="line">rsListerSynced cache.InformerSynced</span><br><span class="line">rsIndexer      cache.Indexer</span><br><span class="line"></span><br><span class="line"><span class="comment">// A store of pods, populated by the shared informer passed to NewReplicaSetController</span></span><br><span class="line">podLister corelisters.PodLister</span><br><span class="line"><span class="comment">// podListerSynced returns true if the pod store has been synced at least once.</span></span><br><span class="line"><span class="comment">// Added as a member to the struct to allow injection for testing.</span></span><br><span class="line">podListerSynced cache.InformerSynced</span><br><span class="line"></span><br><span class="line"><span class="comment">// Controllers that need to be synced</span></span><br><span class="line">queue workqueue.RateLimitingInterface</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>有没有一种熟悉的感觉，我第一次看的时候，就觉得和 deployment 几乎一模一样，而那么关键点也是在启动和同步(syncHandler)的的时候了，所以让我们直接往下看。</p><h3 id="创建"><a href="#创建" class="headerlink" title="创建"></a>创建</h3><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/replicaset/replica_set.go:138</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewBaseController</span><span class="params">(logger klog.Logger, rsInformer appsinformers.ReplicaSetInformer, podInformer coreinformers.PodInformer, kubeClient clientset.Interface, burstReplicas <span class="type">int</span>,</span></span></span><br><span class="line"><span class="params"><span class="function">gvk schema.GroupVersionKind, metricOwnerName, queueName <span class="type">string</span>, podControl controller.PodControlInterface, eventBroadcaster record.EventBroadcaster)</span></span> *ReplicaSetController &#123;</span><br><span class="line"></span><br><span class="line">rsc := &amp;ReplicaSetController&#123;</span><br><span class="line">GroupVersionKind: gvk,</span><br><span class="line">kubeClient:       kubeClient,</span><br><span class="line">podControl:       podControl,</span><br><span class="line">eventBroadcaster: eventBroadcaster,</span><br><span class="line">burstReplicas:    burstReplicas,</span><br><span class="line">expectations:     controller.NewUIDTrackingControllerExpectations(controller.NewControllerExpectations()),</span><br><span class="line">queue:            workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), queueName),</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">rsInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">AddFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">rsc.addRS(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(oldObj, newObj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">rsc.updateRS(logger, oldObj, newObj)</span><br><span class="line">&#125;,</span><br><span class="line">DeleteFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">rsc.deleteRS(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">&#125;)</span><br><span class="line"><span class="comment">//.....</span></span><br><span class="line"></span><br><span class="line">rsc.syncHandler = rsc.syncReplicaSet</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> rsc</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>还是一样的 Informer 熟悉的配方，只不过这次都换成了 RS 的方法。然后就让我惊奇的发现了下面的路径。</p><p>整个路径就是：<code>Run</code> -&gt; <code>worker</code> -&gt; <code>processNextWorkItem</code> -&gt; <code>syncHandler</code> 。</p><p>这？这不就和 deployment 一模一样了么。_所以其实对于这类对象本身的操作设计和行为都是一致的_。还是那一套事件处理的机制，唯独不一样的是什么呢？让我们回到 deployment 里面看看</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/deployment_controller.go:101</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewDeploymentController</span><span class="params">(ctx context.Context, dInformer appsinformers.DeploymentInformer, rsInformer appsinformers.ReplicaSetInformer, podInformer coreinformers.PodInformer, client clientset.Interface)</span></span> (*DeploymentController, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="comment">// ....</span></span><br><span class="line">dInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">AddFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.addDeployment(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(oldObj, newObj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.updateDeployment(logger, oldObj, newObj)</span><br><span class="line">&#125;,</span><br><span class="line"><span class="comment">// This will enter the sync loop and no-op, because the deployment has been deleted from the store.</span></span><br><span class="line">DeleteFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.deleteDeployment(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">&#125;)</span><br><span class="line">rsInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">AddFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.addReplicaSet(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(oldObj, newObj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.updateReplicaSet(logger, oldObj, newObj)</span><br><span class="line">&#125;,</span><br><span class="line">DeleteFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.deleteReplicaSet(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">&#125;)</span><br><span class="line">podInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">DeleteFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.deletePod(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">&#125;)</span><br><span class="line"><span class="comment">// ....</span></span><br><span class="line"><span class="keyword">return</span> dc, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>之前我们没有发现，原来 <code>rsInformer</code> 就在里面，其实对于 RS 的事件 Deployment 也处理了，那么其实很容易理解了，其实最终控制 pod 的是 RS，而 Deployment 控制 RS ，RS 专心管 pod，而 Deployment 其实额外提供了之前说的升级和回滚等等利于实际应用升级部署的操作。所以我们上一节中更多关注在了 Deployment 本身的更新动作上，而 pod 其实还没仔细看。这里 RS 补充了这一部分。</p><p>于是说我就将源码的重心移动到了如何控制 pod 上面，于是我很快发现了 <code>CreatePods</code> 方法被调用的地方。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/replicaset/replica_set.go:566</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(rsc *ReplicaSetController)</span></span> manageReplicas(ctx context.Context, filteredPods []*v1.Pod, rs *apps.ReplicaSet) <span class="type">error</span> &#123;</span><br><span class="line">diff := <span class="built_in">len</span>(filteredPods) - <span class="type">int</span>(*(rs.Spec.Replicas))</span><br><span class="line">rsKey, err := controller.KeyFunc(rs)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">utilruntime.HandleError(fmt.Errorf(<span class="string">&quot;couldn&#x27;t get key for %v %#v: %v&quot;</span>, rsc.Kind, rs, err))</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line">logger := klog.FromContext(ctx)</span><br><span class="line"><span class="keyword">if</span> diff &lt; <span class="number">0</span> &#123;</span><br><span class="line">diff *= <span class="number">-1</span></span><br><span class="line"><span class="keyword">if</span> diff &gt; rsc.burstReplicas &#123;</span><br><span class="line">diff = rsc.burstReplicas</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">rsc.expectations.ExpectCreations(logger, rsKey, diff)</span><br><span class="line">logger.V(<span class="number">2</span>).Info(<span class="string">&quot;Too few replicas&quot;</span>, <span class="string">&quot;replicaSet&quot;</span>, klog.KObj(rs), <span class="string">&quot;need&quot;</span>, *(rs.Spec.Replicas), <span class="string">&quot;creating&quot;</span>, diff)</span><br><span class="line"></span><br><span class="line">successfulCreations, err := slowStartBatch(diff, controller.SlowStartInitialBatchSize, <span class="function"><span class="keyword">func</span><span class="params">()</span></span> <span class="type">error</span> &#123;</span><br><span class="line">err := rsc.podControl.CreatePods(ctx, rs.Namespace, &amp;rs.Spec.Template, rs, metav1.NewControllerRef(rs, rsc.GroupVersionKind))</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">if</span> apierrors.HasStatusCause(err, v1.NamespaceTerminatingCause) &#123;</span><br><span class="line"><span class="comment">// if the namespace is being terminated, we don&#x27;t have to do</span></span><br><span class="line"><span class="comment">// anything because any creation will fail</span></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Any skipped pods that we never attempted to start shouldn&#x27;t be expected.</span></span><br><span class="line"><span class="comment">// The skipped pods will be retried later. The next controller resync will</span></span><br><span class="line"><span class="comment">// retry the slow start process.</span></span><br><span class="line"><span class="keyword">if</span> skippedPods := diff - successfulCreations; skippedPods &gt; <span class="number">0</span> &#123;</span><br><span class="line">logger.V(<span class="number">2</span>).Info(<span class="string">&quot;Slow-start failure. Skipping creation of pods, decrementing expectations&quot;</span>, <span class="string">&quot;podsSkipped&quot;</span>, skippedPods, <span class="string">&quot;kind&quot;</span>, rsc.Kind, <span class="string">&quot;replicaSet&quot;</span>, klog.KObj(rs))</span><br><span class="line"><span class="keyword">for</span> i := <span class="number">0</span>; i &lt; skippedPods; i++ &#123;</span><br><span class="line"><span class="comment">// Decrement the expected number of creates because the informer won&#x27;t observe this pod</span></span><br><span class="line">rsc.expectations.CreationObserved(logger, rsKey)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//...</span></span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>其实这里的大方向的逻辑非常简单，就是根据 <code>diff</code> 的不同来对 pod 进行不同的处理，如果 <code>diff &lt; 0</code> 就会去创建，而 <code>diff &gt; 0</code> 就会去删除。删除的代码这里就不做展示了。</p><p>这里也是一个和之前函数式调用类似的方式调用 <code>slowStartBatch</code> 方法的时候将 <code>fn</code> 传递了进去，<code>fn</code> 其中就就是具体创建的方法，而 <code>slowStartBatch</code> 又是一个很不错可以学习的设计。这个我们放在后面一起说。而在这之前有一个重要步骤是 <code>expectations</code> 的创建</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">rsc.expectations.ExpectCreations(logger, rsKey, diff)</span><br></pre></td></tr></table></figure><p>那么其实 expectations 里面记录的就是对于 rsKEY 的 diff，也就是改变量，让我们具体来看看这个里面是什么。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/controller_utils.go:332</span></span><br><span class="line"><span class="keyword">type</span> UIDTrackingControllerExpectations <span class="keyword">struct</span> &#123;</span><br><span class="line">ControllerExpectationsInterface</span><br><span class="line"><span class="comment">// <span class="doctag">TODO:</span> There is a much nicer way to do this that involves a single store,</span></span><br><span class="line"><span class="comment">// a lock per entry, and a ControlleeExpectationsInterface type.</span></span><br><span class="line">uidStoreLock sync.Mutex</span><br><span class="line"><span class="comment">// Store used for the UIDs associated with any expectation tracked via the</span></span><br><span class="line"><span class="comment">// ControllerExpectationsInterface.</span></span><br><span class="line">uidStore cache.Store</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// pkg/controller/controller_utils.go:147</span></span><br><span class="line"><span class="keyword">type</span> ControllerExpectationsInterface <span class="keyword">interface</span> &#123;</span><br><span class="line">GetExpectations(controllerKey <span class="type">string</span>) (*ControlleeExpectations, <span class="type">bool</span>, <span class="type">error</span>)</span><br><span class="line">SatisfiedExpectations(logger klog.Logger, controllerKey <span class="type">string</span>) <span class="type">bool</span></span><br><span class="line">DeleteExpectations(logger klog.Logger, controllerKey <span class="type">string</span>)</span><br><span class="line">SetExpectations(logger klog.Logger, controllerKey <span class="type">string</span>, add, del <span class="type">int</span>) <span class="type">error</span></span><br><span class="line">ExpectCreations(logger klog.Logger, controllerKey <span class="type">string</span>, adds <span class="type">int</span>) <span class="type">error</span></span><br><span class="line">ExpectDeletions(logger klog.Logger, controllerKey <span class="type">string</span>, dels <span class="type">int</span>) <span class="type">error</span></span><br><span class="line">CreationObserved(logger klog.Logger, controllerKey <span class="type">string</span>)</span><br><span class="line">DeletionObserved(logger klog.Logger, controllerKey <span class="type">string</span>)</span><br><span class="line">RaiseExpectations(logger klog.Logger, controllerKey <span class="type">string</span>, add, del <span class="type">int</span>)</span><br><span class="line">LowerExpectations(logger klog.Logger, controllerKey <span class="type">string</span>, add, del <span class="type">int</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>上面的接口，而具体的实现是 <code>ControllerExpectations</code> 来实现的。它的结构非常简单其本质是 <code>cache.Store</code> 也就是一个泛型缓存，当然在那个年代里面还没有泛型，就是 interface 而已。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/controller_utils.go:265</span></span><br><span class="line"><span class="keyword">type</span> ControlleeExpectations <span class="keyword">struct</span> &#123;</span><br><span class="line">add       <span class="type">int64</span></span><br><span class="line">del       <span class="type">int64</span></span><br><span class="line">key       <span class="type">string</span></span><br><span class="line">timestamp time.Time</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>看到结构就容易理解了，其实内部存放的就是 add 和 del 的数量也就是期望的改变量。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/controller_utils.go:281</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(e *ControlleeExpectations)</span></span> Fulfilled() <span class="type">bool</span> &#123;</span><br><span class="line"><span class="keyword">return</span> atomic.LoadInt64(&amp;e.add) &lt;= <span class="number">0</span> &amp;&amp; atomic.LoadInt64(&amp;e.del) &lt;= <span class="number">0</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>当二者都为 0 时则实际 &#x3D;&#x3D; 预期，也就是我们在第一节提到的控制循环类似的思路。而其中还设计了一个过期时间，从而包装了整个逻辑，包括改变、校验等等。虽然感觉实际(工作中)用到的情况会比较少，但是这部分的代码依旧可以做一个案例参考，毕竟里面还是有一些并发操作的。而且这部分应该是还有优化的空间。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li>RS 和 Deployment 关系是什么？<ol><li>Deployment 控制 RS ，RS 控制 Pod</li></ol></li><li>有何特别的设计？<ol><li>那当然是 expectations 的设计</li></ol></li></ol><h2 id="额外扩展"><a href="#额外扩展" class="headerlink" title="额外扩展"></a>额外扩展</h2><p>让我们回头来看看 <code>slowStartBatch</code> 的实现部分吧。从翻译上来看应该叫作批量慢启动。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">slowStartBatch</span><span class="params">(count <span class="type">int</span>, initialBatchSize <span class="type">int</span>, fn <span class="keyword">func</span>()</span></span> <span class="type">error</span>) (<span class="type">int</span>, <span class="type">error</span>) &#123;</span><br><span class="line">remaining := count</span><br><span class="line">successes := <span class="number">0</span></span><br><span class="line"><span class="keyword">for</span> batchSize := integer.IntMin(remaining, initialBatchSize); batchSize &gt; <span class="number">0</span>; batchSize = integer.IntMin(<span class="number">2</span>*batchSize, remaining) &#123;</span><br><span class="line">errCh := <span class="built_in">make</span>(<span class="keyword">chan</span> <span class="type">error</span>, batchSize)</span><br><span class="line"><span class="keyword">var</span> wg sync.WaitGroup</span><br><span class="line">wg.Add(batchSize)</span><br><span class="line"><span class="keyword">for</span> i := <span class="number">0</span>; i &lt; batchSize; i++ &#123;</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line"><span class="keyword">defer</span> wg.Done()</span><br><span class="line"><span class="keyword">if</span> err := fn(); err != <span class="literal">nil</span> &#123;</span><br><span class="line">errCh &lt;- err</span><br><span class="line">&#125;</span><br><span class="line">&#125;()</span><br><span class="line">&#125;</span><br><span class="line">wg.Wait()</span><br><span class="line">curSuccesses := batchSize - <span class="built_in">len</span>(errCh)</span><br><span class="line">successes += curSuccesses</span><br><span class="line"><span class="keyword">if</span> <span class="built_in">len</span>(errCh) &gt; <span class="number">0</span> &#123;</span><br><span class="line"><span class="keyword">return</span> successes, &lt;-errCh</span><br><span class="line">&#125;</span><br><span class="line">remaining -= batchSize</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> successes, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>看起来很复杂，其实很简单。</p><ol><li><code>initialBatchSize</code> 从 1 开始，<code>batchSize</code> 就是从 1 开始(当然和 remaining 相比)，每次批次是 <code>x2</code> 倍的递增，也就是 1 2 4 8 这样</li><li>每个执行批次通过 <code>wg</code> 控制并发</li><li>也就是每次只要前面一直成功并发就会越来越快</li></ol><p>这个部分的设计几乎可以直接抄过来用的，这个 TCP 的拥塞控制还不一样，由于总工作量是有限制的，所以到后面不会并发爆发，也就不需要拥塞避免的部分，小体量的任务场景下好用。</p><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><p>其实当我们看完源码，回过头来审视这两个对象的名称，<code>Deployment</code> 和 <code>ReplicaSet</code>，我们就会发现，其实它们的名字就是它们的职责，Deployment 是部署，ReplicaSet 是复制集，而复制集就是去控制 pod 的复制，而部署就是去控制复制集的部署。只不过可能我们对于英文的不敏感，所以一开始不会这么直观的感受到。</p><h3 id="设计上"><a href="#设计上" class="headerlink" title="设计上"></a>设计上</h3><p>从设计的角度而言，我觉得我们能学到的是有关与对象与职责的设计。我们可以看到 Deployment 并没有直接去控制 pod，而 ReplicaSet 去控制了 pod ，所以 ReplicaSet 的职责就是去管理 pod，而 Deployment 的职责是什么呢？其实是应用的生命周期，Deployment 允许你定义升级策略、回滚操作、滚动更新等功能，使得在应用程序更新时能够更加方便和可控，当你需要更新应用程序时，可以通过更新 Deployment 的来触发新的 ReplicaSet 的创建，然后逐步替换旧的 ReplicaSet 中的 Pod。</p><h3 id="编码上"><a href="#编码上" class="headerlink" title="编码上"></a>编码上</h3><p>在编码上，我们可以学到两点，前面也提到了。一个是有关 <code>expectations</code> 的包装和设计，一个是有关 <code>slowStartBatch</code> 对于慢启动函数的设计，这些都是可以被我们学习和利用的。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>Go 应用容器下优雅停止坑点</title>
    <link href="https://www.linkinstars.com/post/2c762cec.html"/>
    <id>https://www.linkinstars.com/post/2c762cec.html</id>
    <published>2024-07-14T16:00:00.000Z</published>
    <updated>2024-08-19T06:55:05.675Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/2c762cec.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>之前我有写过 <a href="https://www.linkinstars.com/post/12666403.html">go 应用在 k8s 中如何优雅停止</a> 的博客，理论上在配置好对应的参数之后就能 优雅停止 了，但是最近接触到了两个场景，会导致配置的优雅停止失效，为了避免踩坑，对于之前的博客进一步进行补充。</p><h2 id="场景说明"><a href="#场景说明" class="headerlink" title="场景说明"></a>场景说明</h2><p>有了之前的经验，Golang 应用本身没有问题，它已经接受并处理 <code>SIGTERM</code> 和 <code>SIGINT</code> 信号，但是实际场景出现的情况，在 k8s 或者 docker 停止的时候 <strong>有一些缓慢</strong> ，但是由于最终容器还是会被关闭，于是这个问题就没有关注，这个现象也很容易被忽略。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">package</span> main</span><br><span class="line"></span><br><span class="line"><span class="keyword">import</span> (</span><br><span class="line"><span class="string">&quot;fmt&quot;</span></span><br><span class="line"><span class="string">&quot;os&quot;</span></span><br><span class="line"><span class="string">&quot;os/signal&quot;</span></span><br><span class="line"><span class="string">&quot;syscall&quot;</span></span><br><span class="line">)</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">main</span><span class="params">()</span></span> &#123;</span><br><span class="line">fmt.Println(<span class="string">&quot;启动&quot;</span>)</span><br><span class="line">ch := <span class="built_in">make</span>(<span class="keyword">chan</span> os.Signal, <span class="number">1</span>)</span><br><span class="line">signal.Notify(ch, syscall.SIGTERM, syscall.SIGINT)</span><br><span class="line">s := &lt;-ch</span><br><span class="line"><span class="keyword">switch</span> &#123;</span><br><span class="line"><span class="keyword">case</span> s == syscall.SIGINT:</span><br><span class="line">fmt.Println(<span class="string">&quot;收到 SIGINT 信号!&quot;</span>)</span><br><span class="line"><span class="keyword">case</span> s == syscall.SIGTERM:</span><br><span class="line">fmt.Println(<span class="string">&quot;收到 SIGTERM 信号!&quot;</span>)</span><br><span class="line">&#125;</span><br><span class="line">fmt.Println(<span class="string">&quot;退出&quot;</span>)</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><h2 id="场景-1"><a href="#场景-1" class="headerlink" title="场景 1"></a>场景 1</h2><p>这个场景非常简单，也是容易被使用到的一个场景</p><p>Dockerfile 是这样的</p><figure class="highlight dockerfile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">FROM</span> alpine</span><br><span class="line"><span class="keyword">ADD</span><span class="language-bash"> app /app</span></span><br><span class="line"><span class="keyword">ADD</span><span class="language-bash"> entrypoint.sh /entrypoint.sh</span></span><br><span class="line"><span class="keyword">ENTRYPOINT</span><span class="language-bash"> [<span class="string">&quot;/entrypoint.sh&quot;</span>]</span></span><br></pre></td></tr></table></figure><p><code>entrypoint.sh</code> 是这样的</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/sh</span></span><br><span class="line">/app</span><br></pre></td></tr></table></figure><p>这里是做了一定的抽象，由于这个入口脚步这个部分可能包含一些实际初始化的工作，这部分工作可能是程序没办法处理的环境等等问题，有兴趣的同学可以按下面的步骤测试一下。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line">GOOS=linux GOARCH=amd64 go build -o app .</span><br><span class="line">docker build -t star .</span><br><span class="line">docker run --name star star</span><br></pre></td></tr></table></figure><p>启动之后你就会发现一个问题，<code>Ctrl+c</code> 是没办法关闭的，执行 <code>docker stop star</code> 之后需要一段时间才会关闭，并且关闭之前没有任何信号相关的日志信息。</p><h3 id="问题原因"><a href="#问题原因" class="headerlink" title="问题原因"></a>问题原因</h3><p>这个场景出现问题的原因很简单，就是因为我们运行的方式是以脚步的方式运行的，主进程并不是业务的 app 而是 shell。而关闭时 <code>SIGTERM</code> 信号会发给 shell ，但是 shell 是不会把信号给你的。我们可以进入容器 ps 一下马上就清楚了。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br></pre></td><td class="code"><pre><span class="line">docker <span class="built_in">exec</span> -it star sh</span><br><span class="line">/ <span class="comment"># ps</span></span><br><span class="line">PID   USER     TIME  COMMAND</span><br><span class="line">    1 root      0:00 &#123;entrypoint.sh&#125; /bin/sh /entrypoint.sh</span><br><span class="line">    7 root      0:00 &#123;app&#125; [rosetta] /app /app</span><br><span class="line">   14 root      0:00 sh</span><br><span class="line">   20 root      0:00 ps</span><br></pre></td></tr></table></figure><h3 id="解决"><a href="#解决" class="headerlink" title="解决"></a>解决</h3><p>这个场景的解决方式非常简单，只需要修改一下脚步就可以了</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/sh</span></span><br><span class="line"><span class="built_in">exec</span> /app</span><br></pre></td></tr></table></figure><p>使用 exec 让新启动的 app 作为主进程就可以</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">docker <span class="built_in">exec</span> -it star sh</span><br><span class="line">/ <span class="comment"># ps</span></span><br><span class="line">PID   USER     TIME  COMMAND</span><br><span class="line">    1 root      0:00 &#123;app&#125; [rosetta] /app /app</span><br><span class="line">   12 root      0:00 sh</span><br><span class="line">   18 root      0:00 ps</span><br></pre></td></tr></table></figure><h2 id="场景-2"><a href="#场景-2" class="headerlink" title="场景 2"></a>场景 2</h2><p>这个场景是，当我们的一个容器有多个进程的时候，入口脚步可能是这样的（这里是用同一个二进制模拟，实际场景可能是多个不同应用）</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/bin/sh</span></span><br><span class="line">/app &amp;</span><br><span class="line">/app</span><br></pre></td></tr></table></figure><p>我们没办法同时让两个进程都成为主进程，这个时候就要找外援帮忙了，<code>dumb-init</code> 就是一个不错的选择</p><figure class="highlight dockerfile"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line"><span class="keyword">FROM</span> alpine</span><br><span class="line"><span class="keyword">RUN</span><span class="language-bash"> apk add --no-cache dumb-init</span></span><br><span class="line"><span class="keyword">ADD</span><span class="language-bash"> entrypoint.sh /entrypoint.sh</span></span><br><span class="line"><span class="keyword">ADD</span><span class="language-bash"> app /app</span></span><br><span class="line"><span class="keyword">ENTRYPOINT</span><span class="language-bash"> [<span class="string">&quot;/usr/bin/dumb-init&quot;</span>, <span class="string">&quot;--&quot;</span>]</span></span><br><span class="line"><span class="keyword">CMD</span><span class="language-bash"> [<span class="string">&quot;/entrypoint.sh&quot;</span>]</span></span><br></pre></td></tr></table></figure><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br></pre></td><td class="code"><pre><span class="line"><span class="meta">#!/usr/bin/dumb-init /bin/sh</span></span><br><span class="line">/app &amp;</span><br><span class="line">/app</span><br></pre></td></tr></table></figure><p>同时 <code>dumb-init</code> 可以很容易的帮助我们实现信号的传递工作，以它作为主进程，以管理我们的应用子进程。</p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">启动</span><br><span class="line">启动</span><br><span class="line">^C收到 SIGINT 信号!</span><br><span class="line">退出</span><br><span class="line">收到 SIGINT 信号!</span><br><span class="line">退出</span><br></pre></td></tr></table></figure><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>当然实际的项目中如果没有特别的需求，还是建议直接启动，而并非使用脚本，一旦使用脚本就需要注意信号和进程的特殊情况。并且，一个应用建议一个容器，这样可以避免很多问题。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;之前我有写过 &lt;a href=&quot;https://www.linkinstars.com/post/12666403.html&quot;&gt;go</summary>
        
      
    
    
    
    <category term="architecture" scheme="https://www.linkinstars.com/categories/architecture/"/>
    
    
    <category term="graceful-shutdown" scheme="https://www.linkinstars.com/tags/graceful-shutdown/"/>
    
  </entry>
  
  <entry>
    <title>程序员减肥记</title>
    <link href="https://www.linkinstars.com/post/ac644e2.html"/>
    <id>https://www.linkinstars.com/post/ac644e2.html</id>
    <published>2024-06-30T16:00:00.000Z</published>
    <updated>2024-07-31T06:49:24.810Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/ac644e2.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>最近很长一段时间没有写文章了，因为这一段时间在做个“大事”，减肥。运动本身还是挺花时间的，所以确实就没时间更新那种非常干货的博客了。</p></blockquote><h2 id="起因"><a href="#起因" class="headerlink" title="起因"></a>起因</h2><p>原因是体检发现 BMI 超标，<strong>到了接近 25</strong>，由于本人之前还没有到达过这样的体重，确实是有大肚子了。于是毅然开始了减肥之路。整个过程我发现，其实对于我这样小基数的情况来说，其实减肥并不难，科学减肥也没有那么多的痛苦。</p><h2 id="结果"><a href="#结果" class="headerlink" title="结果"></a>结果</h2><p>先说结果，一共是 3 个月(4,5,6)时间，减掉了 10 kg，从 74kg 减到了 64kg。基本上是每个月减 3kg 左右。</p><p><img src="https://blog.linkinstars.com/blog/programmer-lose-weight-line-chart.png" alt="programmer-lose-weight-line-chart"></p><p>从结果来看我自己是非常满意的，下面分享一下整个过程中的一些心得，希望对你有所帮助。之前大学的时候也有过减肥的经历，不过没有这次那么多。</p><p>我先说一些情况：</p><ol><li>我是小基数，对于大基数的减肥来说估计是不适用的</li><li>我是程序员，时间其实对于我来说并不宽裕，如何有效利用时间是我最大的问题</li><li>我仅仅只是想让我的体重回到一个正常的水平上，不是追求肌肉或者身材</li></ol><p>所以在这样的条件下，如何做到 理科(理性+科学) 减肥呢？</p><h2 id="理科减肥"><a href="#理科减肥" class="headerlink" title="理科减肥"></a>理科减肥</h2><p>这里说明一下我的理念和方法，再说说心得，方法不一定对任何人有效，但心得希望能帮助到你。</p><h3 id="科学"><a href="#科学" class="headerlink" title="科学"></a>科学</h3><p>首先，我相信科学说的 “能量守恒”，所以减肥的理念很简单，也就是各大博主说的制造 <strong>热量缺口</strong> ，让你的 <strong>输出 &gt; 输入</strong>。只要你每天消耗的更多，那一定会瘦下来的，这里不再赘述。</p><blockquote><p>那么根据这个理念可以推导出什么呢？</p></blockquote><p>如果你现在正常吃喝，什么都不改变，你的体重是不变的，证明你每天的 输入 ≈ 输出 ，所以如果你想减肥，那么很简单，<strong>即使在你吃不改变的情况下，增加运动也是能减肥的</strong>。这也是我开头一个月所做的，开头一个月我并没有一上来就控制严格的饮食，该吃吃，该喝喝，只是增加了运动。</p><h3 id="理性"><a href="#理性" class="headerlink" title="理性"></a>理性</h3><p>然后，我说说理性，市面上的减肥方法太多了，五花八门，你要学会分辨什么是有用的，什么是谣言。理性的看待减肥，不要盲目跟风。</p><h3 id="迈开腿"><a href="#迈开腿" class="headerlink" title="迈开腿"></a>迈开腿</h3><ol><li>跑步：整个过程我就跑了 3 次 5 公里</li><li>有氧拳击：Switch 的游戏，前面几乎每天都会做 30 分钟的有氧拳击，后面会降低到每周 3-4 次</li></ol><p>没了，没想到把，就这么简单。</p><p>说说原因：跑步是为了启动身体的机能，由于我长时间没有锻炼了，如果想让身体快速适应节奏，那么必须用一个慢速提升且强度足够的运动。跑完之后我的身体明显就会出现全身发热，并且风吹之后痒的不行，还有骨骼肌战栗的那种感觉。</p><p>而选择有氧拳击，一方面是因为没有场地环境限制，在家就行，不用担心下雨什么的，一方面效果也确实不错，激励的效果让我也能坚持（热辣滚烫选择拳击并不仅仅只是电影需求是真的累的）。还有一点是不太伤害膝盖。</p><p><img src="https://blog.linkinstars.com/blog/programmer-lose-weight-boxing.png" alt="programmer-lose-weight-boxing"></p><blockquote><p>网上有人说有氧拳击不够累，我觉得有两点：一个是你打的拳不够用力，你每个都软绵绵的那肯定不行，另一个是你没有脚步，拳击并不仅仅只是打拳，跟着音乐重心不断前后调整的脚步也是非常重要的</p></blockquote><h3 id="管住嘴"><a href="#管住嘴" class="headerlink" title="管住嘴"></a>管住嘴</h3><p>三个月我一共分了三个阶段：</p><ol><li><strong>减少额外的食物</strong>，比如零食、饮料（在这个阶段我最多只喝美式）。</li><li>开始<strong>寻找“干净”的食物</strong>吃，会避免过多的碳水摄入，这个阶段是最饿的。</li><li>逐渐<strong>恢复正常饮食</strong>，开始逐步恢复碳水。</li></ol><p>我解释一下我所谓的 “干净” 食物。由于我们平常上班，绝大多数公司是没有食堂的，所以外卖和楼下就成为了首选，这样会导致问题就是，油不对。在我看来，糖和油是最容易让你摄入过多热量东西，并且油会更可怕，因为不干净的油可能会影响你的代谢。所以我会尽量会避免摄入带有不确定油的食物，如一些外卖的炒菜什么的，如果要油我会选择那些知名一点的。下面就是我强烈推荐的食物来源。</p><p>还要解释的一点是，为什么第三阶段我要恢复，因为在第三阶段我马上就要到达我的目标了，而在我到达目标之后，我不想继续一直保持那种饮食，因为并且我不想反弹，所以我会逐渐恢复，让我的身体适应。如果马上出现反弹的情况，那么我还会继续我的减肥计划，总之是想让身体适应。</p><h4 id="我推荐的食物"><a href="#我推荐的食物" class="headerlink" title="我推荐的食物"></a>我推荐的食物</h4><ul><li><strong>沙县小吃</strong>：这个绝对是减脂期间的最大帮手，而且价格不贵，你只要找一家相对干净的就没问题。鸭腿饭，你饭就吃两口，其他的青菜、鸡蛋、鸭腿都是妥妥的干净食物。</li><li><strong>煮的</strong>：如果一个食物是煮的，那么相对于炒的，那肯定好一些。比如麻辣烫(食材你自己把控，你要吃丸子那就完蛋)，比如清汤面，其实一些汤面我觉得是完全可以吃的，只要你别把面吃完。</li><li><strong>麦当劳&#x2F;肯德基</strong>：有人就会说，你减肥吃麦当劳？吃肯德基？没错，我的理由有两个，一个是这二者都提供了热量参数，你很容易知道哪些能吃，哪些不能吃，另一个是他们的油是相对干净的，相较于其他，我更喜欢这两家。</li><li><strong>赛百味</strong>：这个绝对是我个人的最佳，一方面是因为他们提供了一个标准的健康餐的参数：<strong>全麦面包+蔬菜+牛肉</strong> 或者 鸡肉，妥妥的 <strong>碳水+蛋白质+蔬菜</strong> 完美搭配，前期我会选择不加酱料，后期恢复会加一点。</li></ul><p>注意，抛开质量谈热量都是耍流氓，比如有的酱料热量确实很高，但是你就吃 5g 也大可不必视为魔鬼。</p><h2 id="我的心得"><a href="#我的心得" class="headerlink" title="我的心得"></a>我的心得</h2><p>下面的心得是我在这个过程中的一些感悟，希望对你有所帮助。</p><h3 id="30-分钟才开始燃脂"><a href="#30-分钟才开始燃脂" class="headerlink" title="30 分钟才开始燃脂"></a>30 分钟才开始燃脂</h3><p>这个谣言到这里能不能停！很多地方说运动 30 分钟才开始燃脂。理科的我们都知道，你只要呼吸，你 ATP 就得消耗供能，你只要运动，你就得消耗热量。你管它消耗的热量哪里来，糖原？脂肪？肝脏，你别管，你只要知道，你动肯定比不动要强。</p><h3 id="不要计算热量"><a href="#不要计算热量" class="headerlink" title="不要计算热量"></a>不要计算热量</h3><p>最早大学时期减肥的时候我也算，“薄荷” 对吧，我也记录，但是这次减肥我不算了。我给出的两个理由是：</p><ol><li>热量一定是算不准的，你的基础代谢在不断变化，而你吃的东西吸收率(我不知道有没有这个概念，你就当我杜撰的)也不一样(我不信每个人吃进去多少就吸收多少)，所以你算的热量是不准的。与其算这个，不如看<strong>体重变化趋势</strong>。</li><li>麻烦，我懒</li></ol><h3 id="不过于关注体重"><a href="#不过于关注体重" class="headerlink" title="不过于关注体重"></a>不过于关注体重</h3><p>你要关注的不是体重数字本身，而是体重的变化趋势。如果一周的趋势是下降的，那么没问题，如果，一周的趋势是升高的，那就要注意了，你要调整了。为什么呢？体重的短期波动太容易了，前一天少喝水，后一天马上轻 1kg 是常有的事情。所以短期的体重波动参考意义不大。</p><h3 id="“多动症”"><a href="#“多动症”" class="headerlink" title="“多动症”"></a>“多动症”</h3><p>我有时候会在想为什么小的时候，比如高中的时候我不会胖呢？那个时候吃的很多啊(已经不长个子了)，然后就回想起来哪个时候没事就瞎闹腾，交作业跑上跑下，一楼到四楼每天都跑来跑去，有时候还和同学你追我赶，并且去食堂吃夜宵还全速前进…</p><p>现在呢？办公室坐下，4小时不动，呵呵。</p><p>所以我渐渐开始培养了一些小的 “多动症” 习惯，比如：</p><ol><li>上班坐地铁，我会选择走楼梯，而不是电梯</li><li>公交如果还没来会选择远一点的站点，而不是近一点的</li><li>上厕所选择上一楼，而不是直接同楼层的</li><li>…</li></ol><p>我个人会认为，这些小的习惯，会让你的身体保持一定的活力，只要心脏多跳两下，你的身体就会多消耗一点热量。当然，我不知道能有多大帮助，我还是那句话，总比没有的好。</p><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>通过我的减肥经历，我总结了以下几点要点：</p><ol><li>理性减肥：减肥方法繁多，要学会分辨有效的方法和谣言，理性看待减肥。</li><li>科学减肥：减肥的核心是制造热量缺口，让输出大于输入，运动和控制饮食是实现热量缺口的关键。</li><li>运动：跑步和有氧拳击是我选择的运动方式，跑步可以启动身体机能，有氧拳击可以在家中进行，激励效果好。</li><li>饮食控制：减少额外食物摄入，选择干净食物，避免摄入过多的热量，逐渐恢复正常饮食。</li><li>关注体重变化趋势：不过于关注体重数字本身，而是关注体重的变化趋势，调整饮食和运动计划。</li><li>培养多动习惯：通过一些小的习惯，如走楼梯、选择远一点的站点等，增加身体活动量，消耗更多热量。</li></ol><p>这些是我个人的经验和心得，希望对你有所帮助。记住，减肥是一个长期的过程，要坚持并根据自己的情况进行调整。祝你成功减肥！</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;最近很长一段时间没有写文章了，因为这一段时间在做个“大事”，减肥。运动本身还是挺花时间的，所以确实就没时间更新那种非常干货的博客了。&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;起因&quot;&gt;&lt;a href=&quot;#起因&quot;</summary>
        
      
    
    
    
    <category term="减肥" scheme="https://www.linkinstars.com/categories/%E5%87%8F%E8%82%A5/"/>
    
    
    <category term="lose-weight" scheme="https://www.linkinstars.com/tags/lose-weight/"/>
    
  </entry>
  
  <entry>
    <title>Docker 容器如何访问宿主机服务</title>
    <link href="https://www.linkinstars.com/post/c924c8b9.html"/>
    <id>https://www.linkinstars.com/post/c924c8b9.html</id>
    <published>2024-06-14T16:00:00.000Z</published>
    <updated>2024-07-12T06:25:11.661Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/c924c8b9.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>今天先水一篇，记录一个常见的问题，Docker 容器如何访问宿主机服务。我们知道在一个 docker 容器内部，如果，你直接访问 127.0.0.1 是无法访问到宿主机的，那么怎么办呢？</p><h3 id="最直接的方法"><a href="#最直接的方法" class="headerlink" title="最直接的方法"></a>最直接的方法</h3><p>如果你是自己在用，就是这个机器上也没有其他服务，那我倒是建议直接使用 host 网络，简单直接，一把梭。</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker run -it --network host ubuntu:latest</span><br></pre></td></tr></table></figure><h3 id="优雅的解决"><a href="#优雅的解决" class="headerlink" title="优雅的解决"></a>优雅的解决</h3><p>然后我反找到了一个非常非常长的 issue 中间有个评论是 <a href="https://github.com/docker/for-linux/issues/264#issuecomment-964620100">https://github.com/docker/for-linux/issues/264#issuecomment-964620100</a> （github也是醉了中间隐藏的部分需要你点好几次才会全部展开）</p><p>如果你不想使用 host 网络，可以使用 <code>host.docker.internal</code> 来访问宿主机的服务。</p><p>docker 下可以使用</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">docker run --add-host host.docker.internal:host-gateway</span><br></pre></td></tr></table></figure><p>docker-compose 下可以使用</p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line"><span class="attr">services:</span></span><br><span class="line">  <span class="attr">myservice:</span></span><br><span class="line">    <span class="attr">extra_hosts:</span></span><br><span class="line">      <span class="bullet">-</span> <span class="string">host.docker.internal:host-gateway</span></span><br></pre></td></tr></table></figure><p>目前我测试下来 Linux 下是可以的，配置完成之后直接使用 <code>host.docker.internal</code> 就可以访问到宿主机的服务了。由于这个 issue 的时间非常长，所以这个方法适配的版本就不确定了，需要你自己测试下。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot; title=&quot;前言&quot;&gt;&lt;/a&gt;前言&lt;/h2&gt;&lt;p&gt;今天先水一篇，记录一个常见的问题，Docker 容器如何访问宿主机服务。我们知道在一个 docker 容器内部，如果，你直接访问</summary>
        
      
    
    
    
    <category term="docker" scheme="https://www.linkinstars.com/categories/docker/"/>
    
    
    <category term="docker" scheme="https://www.linkinstars.com/tags/docker/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》deployment 滚动更新是如何实现的</title>
    <link href="https://www.linkinstars.com/post/4b7330fb.html"/>
    <id>https://www.linkinstars.com/post/4b7330fb.html</id>
    <published>2024-05-31T16:00:00.000Z</published>
    <updated>2024-07-12T04:26:48.217Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/4b7330fb.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>这一节终于来到了我们最为熟悉的一个对象 deployment，通常这可能是我们学习 k8s 接触的第一个大对象了，我们一般的应用也是以 deployment 来进行部署的，那么对于熟悉的它来说，我们应该从源码里面去找什么目标来看呢？对于我来说，deployment 的更新是我最好奇的，在我重新修改镜像版本之后，deployment 是如何一步步控制已有的 pod 进行更新的呢？这一节我们就从源码中揭秘这个过程。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>deployment 的基础使用</li><li>滚动更新</li></ul><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>在我看来其他的属性与 pod 类似，而 deployment 作为一个 pod 的集合。那，<strong>为什么 deployment 要让 pod 的有多个副本呢</strong>？从最初的角度角度来说肯定是高可用了，所以 deployment 中<strong>最为关键的就是对 pod 的控制</strong>，也就是当 pod 的数量变化的时候，它是如何操作的。</p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><ol><li>deployment 是由哪个对象控制的？</li><li>应用更新的时候 deployment 是如何控制更新过程的？</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><h3 id="寻码过程"><a href="#寻码过程" class="headerlink" title="寻码过程"></a>寻码过程</h3><p>像 deployment 这样的源码比起其他就好找很多了，毕竟命名比较直接。在看来前几节之后，我不知道你是否发现了一个规律。通常看源码的正向思路可以被总结为：</p><ol><li>找到对应实现的数据结构，通常是一个或多个结构体</li><li>看它的初始化，初始化能告诉你其中哪些必要的准备步骤，和具体一些字段的基础能力</li><li>看它的方法，通常就能知道你想要具体实现原理了</li></ol><h3 id="Deployment-结构"><a href="#Deployment-结构" class="headerlink" title="Deployment 结构"></a>Deployment 结构</h3><p>话不多说，先找到它的数据结构</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// vendor/k8s.io/api/apps/v1/types.go:355</span></span><br><span class="line"><span class="keyword">type</span> Deployment <span class="keyword">struct</span> &#123;</span><br><span class="line">metav1.TypeMeta <span class="string">`json:&quot;,inline&quot;`</span></span><br><span class="line"><span class="comment">// Standard object&#x27;s metadata.</span></span><br><span class="line"><span class="comment">// More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#metadata</span></span><br><span class="line"><span class="comment">// +optional</span></span><br><span class="line">metav1.ObjectMeta <span class="string">`json:&quot;metadata,omitempty&quot; protobuf:&quot;bytes,1,opt,name=metadata&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Specification of the desired behavior of the Deployment.</span></span><br><span class="line"><span class="comment">// +optional</span></span><br><span class="line">Spec DeploymentSpec <span class="string">`json:&quot;spec,omitempty&quot; protobuf:&quot;bytes,2,opt,name=spec&quot;`</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// Most recently observed status of the Deployment.</span></span><br><span class="line"><span class="comment">// +optional</span></span><br><span class="line">Status DeploymentStatus <span class="string">`json:&quot;status,omitempty&quot; protobuf:&quot;bytes,3,opt,name=status&quot;`</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>可以看到，数据结构的定义和我们平常使用的 yaml 文件的定义是一一对应的，非常容易理解，可以简单浏览一下 Spec 的属性。</p><p>那么关键的问题来了，是哪个结构在控制 deployment？于是开始寻找 Deployment 的引用，看哪些位置在使用这个数据结构，引用很多，但是你只需要按文件去看就可以了。</p><blockquote><p>阅读其他源码时，如果一个对象的引用，不要去寻找每一个代码引用的位置，而应该先从文件入手，如果引用的文件还是很多，可以从包的角度入手，一个包下通常能力方向也是类似的</p></blockquote><p>于是，我找到了 <code>DeploymentController</code> 这个关键的对象(看命名也应该是它了，控制器嘛)，今天我们后面就是围绕着它展开的。注意哦，</p><h3 id="DeploymentController"><a href="#DeploymentController" class="headerlink" title="DeploymentController"></a>DeploymentController</h3><h4 id="DeploymentController-结构"><a href="#DeploymentController-结构" class="headerlink" title="DeploymentController 结构"></a>DeploymentController 结构</h4><p>还是类似的，先来看看结构</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/deployment_controller.go:66</span></span><br><span class="line"><span class="keyword">type</span> DeploymentController <span class="keyword">struct</span> &#123;</span><br><span class="line"><span class="comment">// rsControl is used for adopting/releasing replica sets.</span></span><br><span class="line">rsControl controller.RSControlInterface</span><br><span class="line">client    clientset.Interface</span><br><span class="line"></span><br><span class="line">eventBroadcaster record.EventBroadcaster</span><br><span class="line">eventRecorder    record.EventRecorder</span><br><span class="line"></span><br><span class="line"><span class="comment">// To allow injection of syncDeployment for testing.</span></span><br><span class="line">syncHandler <span class="function"><span class="keyword">func</span><span class="params">(ctx context.Context, dKey <span class="type">string</span>)</span></span> <span class="type">error</span></span><br><span class="line"><span class="comment">// used for unit testing</span></span><br><span class="line">enqueueDeployment <span class="function"><span class="keyword">func</span><span class="params">(deployment *apps.Deployment)</span></span></span><br><span class="line"></span><br><span class="line"><span class="comment">// dLister can list/get deployments from the shared informer&#x27;s store</span></span><br><span class="line">dLister appslisters.DeploymentLister</span><br><span class="line"><span class="comment">// rsLister can list/get replica sets from the shared informer&#x27;s store</span></span><br><span class="line">rsLister appslisters.ReplicaSetLister</span><br><span class="line"><span class="comment">// podLister can list/get pods from the shared informer&#x27;s store</span></span><br><span class="line">podLister corelisters.PodLister</span><br><span class="line"></span><br><span class="line"><span class="comment">// dListerSynced returns true if the Deployment store has been synced at least once.</span></span><br><span class="line"><span class="comment">// Added as a member to the struct to allow injection for testing.</span></span><br><span class="line">dListerSynced cache.InformerSynced</span><br><span class="line"><span class="comment">// rsListerSynced returns true if the ReplicaSet store has been synced at least once.</span></span><br><span class="line"><span class="comment">// Added as a member to the struct to allow injection for testing.</span></span><br><span class="line">rsListerSynced cache.InformerSynced</span><br><span class="line"><span class="comment">// podListerSynced returns true if the pod store has been synced at least once.</span></span><br><span class="line"><span class="comment">// Added as a member to the struct to allow injection for testing.</span></span><br><span class="line">podListerSynced cache.InformerSynced</span><br><span class="line"></span><br><span class="line"><span class="comment">// Deployments that need to be synced</span></span><br><span class="line">queue workqueue.RateLimitingInterface</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>注意两个点就好，一个是 <code>syncHandler</code> 还有一个是 <code>queue</code> 看到这两个字段我心里其实已经有个大概的思路了。下面就要用到我们在第一节提到的 informer 机制了。</p><h4 id="NewDeploymentController"><a href="#NewDeploymentController" class="headerlink" title="NewDeploymentController"></a>NewDeploymentController</h4><p>初始化是在 <code>NewDeploymentController</code> 方法中，我省略了其中一些部分，留下了重要的几个例子</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/deployment_controller.go:101</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewDeploymentController</span><span class="params">(ctx context.Context, dInformer appsinformers.DeploymentInformer, rsInformer appsinformers.ReplicaSetInformer, podInformer coreinformers.PodInformer, client clientset.Interface)</span></span> (*DeploymentController, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="comment">//....</span></span><br><span class="line">dc := &amp;DeploymentController&#123;</span><br><span class="line"><span class="comment">//....</span></span><br><span class="line">queue:            workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), <span class="string">&quot;deployment&quot;</span>),</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">//....</span></span><br><span class="line"></span><br><span class="line">dInformer.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">AddFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.addDeployment(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(oldObj, newObj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.updateDeployment(logger, oldObj, newObj)</span><br><span class="line">&#125;,</span><br><span class="line"><span class="comment">// This will enter the sync loop and no-op, because the deployment has been deleted from the store.</span></span><br><span class="line">DeleteFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">dc.deleteDeployment(logger, obj)</span><br><span class="line">&#125;,</span><br><span class="line">&#125;)</span><br><span class="line"><span class="comment">//....</span></span><br><span class="line"></span><br><span class="line">dc.syncHandler = dc.syncDeployment</span><br><span class="line">dc.enqueueDeployment = dc.enqueue</span><br><span class="line"></span><br><span class="line"><span class="comment">//....</span></span><br><span class="line"><span class="keyword">return</span> dc, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>有了前面的知识，这里的代码我们就很容易理解了，关键是在于注册了有个 <code>ResourceEvent</code> 处理的各种能力，比如当 Add 事件来的时候，调用 <code>addDeployment</code>。先留心注意下面的两个部分 <code>syncHandler</code> 和 <code>enqueueDeployment</code> 后面会用到。接下来我们肯定会好奇，<code>addDeployment</code> 究竟是如何处理这个事件的，所以我们继续深入看里面的实现。</p><h4 id="addDeployment"><a href="#addDeployment" class="headerlink" title="addDeployment"></a>addDeployment</h4><p>这里面的调用链路很清晰：<code>addDeployment</code> -&gt; <code>enqueueDeployment</code> -&gt; <code>enqueue</code> -&gt; <code>dc.queue.Add(key)</code></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/deployment_controller.go:391</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dc *DeploymentController)</span></span> enqueue(deployment *apps.Deployment) &#123;</span><br><span class="line">key, err := controller.KeyFunc(deployment)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line">utilruntime.HandleError(fmt.Errorf(<span class="string">&quot;couldn&#x27;t get key for object %#v: %v&quot;</span>, deployment, err))</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">dc.queue.Add(key)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>其实这些处理的工作，<strong>将 deployment 对应的 key 丢到队列里面去</strong>，所以下面我们只需要找到哪里在处理队列中的消息就可以了</p><h4 id="Run"><a href="#Run" class="headerlink" title="Run"></a>Run</h4><p>地方也很好找，是在 <code>Run</code> 里面，运行的时候启动了一定数量的 worker，然后每个 worker 循环去取消息。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/deployment_controller.go:157</span></span><br><span class="line"><span class="comment">// Run begins watching and syncing.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dc *DeploymentController)</span></span> Run(ctx context.Context, workers <span class="type">int</span>) &#123;</span><br><span class="line"><span class="comment">//...</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">for</span> i := <span class="number">0</span>; i &lt; workers; i++ &#123;</span><br><span class="line"><span class="keyword">go</span> wait.UntilWithContext(ctx, dc.worker, time.Second)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">&lt;-ctx.Done()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/deployment_controller.go:473</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dc *DeploymentController)</span></span> worker(ctx context.Context) &#123;</span><br><span class="line"><span class="keyword">for</span> dc.processNextWorkItem(ctx) &#123;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dc *DeploymentController)</span></span> processNextWorkItem(ctx context.Context) <span class="type">bool</span> &#123;</span><br><span class="line">key, quit := dc.queue.Get()</span><br><span class="line"><span class="keyword">if</span> quit &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">false</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">defer</span> dc.queue.Done(key)</span><br><span class="line"></span><br><span class="line">err := dc.syncHandler(ctx, key.(<span class="type">string</span>))</span><br><span class="line">dc.handleErr(ctx, err, key)</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> <span class="literal">true</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>整个路径就是：<code>Run</code> -&gt; <code>worker</code> -&gt; <code>processNextWorkItem</code> -&gt; <code>syncHandler</code> 。</p><p>可以看到就是一个标准的生产者消费者模型。然后关键就来到了 <code>syncHandler</code> 变量，还记得 <code>dc.syncHandler = dc.syncDeployment</code> 吗？对的，它在初始化时候被赋值为了 <code>syncDeployment</code> 这就到了我们这一节的重点方法了，注意看。</p><h3 id="syncDeployment"><a href="#syncDeployment" class="headerlink" title="syncDeployment"></a>syncDeployment</h3><p>这里我不想省略太多的代码，因为它本身是一个顺序结构，很容易理解。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br><span class="line">39</span><br><span class="line">40</span><br><span class="line">41</span><br><span class="line">42</span><br><span class="line">43</span><br><span class="line">44</span><br><span class="line">45</span><br><span class="line">46</span><br><span class="line">47</span><br><span class="line">48</span><br><span class="line">49</span><br><span class="line">50</span><br><span class="line">51</span><br><span class="line">52</span><br><span class="line">53</span><br><span class="line">54</span><br><span class="line">55</span><br><span class="line">56</span><br><span class="line">57</span><br><span class="line">58</span><br><span class="line">59</span><br><span class="line">60</span><br><span class="line">61</span><br><span class="line">62</span><br><span class="line">63</span><br><span class="line">64</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/deployment_controller.go:581</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dc *DeploymentController)</span></span> syncDeployment(ctx context.Context, key <span class="type">string</span>) <span class="type">error</span> &#123;</span><br><span class="line"><span class="comment">//...</span></span><br><span class="line"></span><br><span class="line">deployment, err := dc.dLister.Deployments(namespace).Get(name)</span><br><span class="line"><span class="keyword">if</span> errors.IsNotFound(err) &#123;</span><br><span class="line">logger.V(<span class="number">2</span>).Info(<span class="string">&quot;Deployment has been deleted&quot;</span>, <span class="string">&quot;deployment&quot;</span>, klog.KRef(namespace, name))</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Deep-copy otherwise we are mutating our cache.</span></span><br><span class="line"><span class="comment">// <span class="doctag">TODO:</span> Deep-copy only when needed.</span></span><br><span class="line">d := deployment.DeepCopy()</span><br><span class="line"></span><br><span class="line"><span class="comment">//...</span></span><br><span class="line"></span><br><span class="line"><span class="comment">// List ReplicaSets owned by this Deployment, while reconciling ControllerRef</span></span><br><span class="line"><span class="comment">// through adoption/orphaning.</span></span><br><span class="line">rsList, err := dc.getReplicaSetsForDeployment(ctx, d)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// List all Pods owned by this Deployment, grouped by their ReplicaSet.</span></span><br><span class="line"><span class="comment">// Current uses of the podMap are:</span></span><br><span class="line"><span class="comment">//</span></span><br><span class="line"><span class="comment">// * check if a Pod is labeled correctly with the pod-template-hash label.</span></span><br><span class="line"><span class="comment">// * check that no old Pods are running in the middle of Recreate Deployments.</span></span><br><span class="line">podMap, err := dc.getPodMapForDeployment(d, rsList)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">//...</span></span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> d.Spec.Paused &#123;</span><br><span class="line"><span class="keyword">return</span> dc.sync(ctx, d, rsList)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// rollback is not re-entrant in case the underlying replica sets are updated with a new</span></span><br><span class="line"><span class="comment">// revision so we should ensure that we won&#x27;t proceed to update replica sets until we</span></span><br><span class="line"><span class="comment">// make sure that the deployment has cleaned up its rollback spec in subsequent enqueues.</span></span><br><span class="line"><span class="keyword">if</span> getRollbackTo(d) != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> dc.rollback(ctx, d, rsList)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">scalingEvent, err := dc.isScalingEvent(ctx, d, rsList)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> scalingEvent &#123;</span><br><span class="line"><span class="keyword">return</span> dc.sync(ctx, d, rsList)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">switch</span> d.Spec.Strategy.Type &#123;</span><br><span class="line"><span class="keyword">case</span> apps.RecreateDeploymentStrategyType:</span><br><span class="line"><span class="keyword">return</span> dc.rolloutRecreate(ctx, d, rsList, podMap)</span><br><span class="line"><span class="keyword">case</span> apps.RollingUpdateDeploymentStrategyType:</span><br><span class="line"><span class="keyword">return</span> dc.rolloutRolling(ctx, d, rsList)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> fmt.Errorf(<span class="string">&quot;unexpected deployment strategy type: %s&quot;</span>, d.Spec.Strategy.Type)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><ol><li>第一步就是找到 deployment</li><li>第二步是找到 rsList 也是我们说的 RS</li><li>然后找到 podMap</li></ol><p>寻找完了之后就开始根据状态进行操作。有哪些操作呢？</p><ul><li>rollback 回滚</li><li>scaling 判断现在是不是在调整大小</li><li>rollout 关键来了，这就是更新，有两种模式<ul><li>Recreate 重建</li><li>Rolling 滚动更新</li></ul></li></ul><p>这里我们最关心的策略终于暴露出来了，那就是滚动更新了，我们赶快来看看里面是怎么实现的。</p><h3 id="rolloutRolling"><a href="#rolloutRolling" class="headerlink" title="rolloutRolling"></a>rolloutRolling</h3><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/controller/deployment/rolling.go:31</span></span><br><span class="line"><span class="comment">// rolloutRolling implements the logic for rolling a new replica set.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(dc *DeploymentController)</span></span> rolloutRolling(ctx context.Context, d *apps.Deployment, rsList []*apps.ReplicaSet) <span class="type">error</span> &#123;</span><br><span class="line">newRS, oldRSs, err := dc.getAllReplicaSetsAndSyncRevision(ctx, d, rsList, <span class="literal">true</span>)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line">allRSs := <span class="built_in">append</span>(oldRSs, newRS)</span><br><span class="line"></span><br><span class="line"><span class="comment">// Scale up, if we can.</span></span><br><span class="line">scaledUp, err := dc.reconcileNewReplicaSet(ctx, allRSs, newRS, d)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> scaledUp &#123;</span><br><span class="line"><span class="comment">// Update DeploymentStatus</span></span><br><span class="line"><span class="keyword">return</span> dc.syncRolloutStatus(ctx, allRSs, newRS, d)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Scale down, if we can.</span></span><br><span class="line">scaledDown, err := dc.reconcileOldReplicaSets(ctx, allRSs, controller.FilterActiveReplicaSets(oldRSs), newRS, d)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> scaledDown &#123;</span><br><span class="line"><span class="comment">// Update DeploymentStatus</span></span><br><span class="line"><span class="keyword">return</span> dc.syncRolloutStatus(ctx, allRSs, newRS, d)</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="keyword">if</span> deploymentutil.DeploymentComplete(d, &amp;d.Status) &#123;</span><br><span class="line"><span class="keyword">if</span> err := dc.cleanupDeployment(ctx, oldRSs, d); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> err</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="comment">// Sync deployment status</span></span><br><span class="line"><span class="keyword">return</span> dc.syncRolloutStatus(ctx, allRSs, newRS, d)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>步骤其实比我想的要简单：</p><ol><li>得到新旧的 RS 进行比较</li><li>先看要不要扩副本数，如果要则直接扩，并且就直接 <code>sync</code> 了不继续了</li><li>然后才轮到缩容，能操作就直接操作了。</li></ol><p>然后，我们来回忆一下 pod 数量在实际更新中的变动过程，如果目前的 pod 是 3&#x2F;3(目标&#x2F;现有)，那么扩容之后就会变成 3&#x2F;4，此时下一次进来就不能扩了，只能变成缩了变成 3&#x2F;3 然后不断往复，直到所以 pod 都满足期望要求的版本。想想真的蛮奇妙的，就是利用了简单的状态管理就实现了整个滚动更新过程，慢慢的就靠近了目标。<strong>这可能就是状态机的优雅吧</strong>，你只管改状态，剩下的协调交给我。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li>deployment 是由哪个对象控制的？<ol><li><code>DeploymentController</code></li></ol></li><li>应用更新的时候 deployment 是如何控制更新过程的？<ol><li>关键其实就在于：<code>rolloutRolling</code> ，将<strong>目标态</strong>的 pod 添加，打破平衡(状态变化)，将不满足的旧状态移除，从而慢慢协调到最终状态。再说的简单一点：先尝试 <code>scaledUp</code> 然后尝试 <code>scaledDown</code></li></ol></li></ol><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><h3 id="设计上"><a href="#设计上" class="headerlink" title="设计上"></a>设计上</h3><p>deployment 这里我们能学到哪些设计上的提升点呢？我个人有下面几个</p><ol><li>首先就是 <code>NewDeploymentController</code> 里面对于 Informer 机制的运用</li><li>利用策略模式 <code>RollingUpdate</code> 和 <code>Recreate</code> 两种不同实现很清晰</li><li>利用状态的管理来构建当前 <code>rolloutRolling</code> 的操作，对于编码来说清晰</li></ol><p>熟悉了这部分的实现，那么对于其他对象类似的功能，我觉得你应该也能有自己的把握了。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》揭秘 k8s 关键机制 informer</title>
    <link href="https://www.linkinstars.com/post/6a76c1cb.html"/>
    <id>https://www.linkinstars.com/post/6a76c1cb.html</id>
    <published>2024-04-29T16:00:00.000Z</published>
    <updated>2024-04-30T07:53:59.000Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/6a76c1cb.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>在第二章我们会去看 k8s 中常用对象的源码，不过在看这些对象之前，我们需要聊一聊 informer 机制。这个机制可以说是 k8s 设计之中的一个重点了。这个机制的设计不仅仅让代码本身变得清晰，更让整个系统的结构更容易扩展。所以这个机制需要放到第二章的第一节来说。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>控制循环</li><li>informer 的使用</li></ul><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>我第一接触 informer 是在使用 client-go 的时候。相信有很多同学和我一样，学习 k8s 的路径通常是，从基本的使用开始，然后慢慢的有一些自定义的需求需要使用 client-go 进行开发。使用 client-go 开发真的很方便，能力很强大。而在其中我第一次碰到了 informer。从了解了这个机制之后，才逐渐明白 k8s 本身是如何去控制里面的资源的。</p><p>还是一样的，本文不涉及具体这个机制的详细原理，更专注在源码本身。当然，我先通过两个小点帮助你回忆起来 informer 机制。</p><h3 id="控制循环"><a href="#控制循环" class="headerlink" title="控制循环"></a>控制循环</h3><p>首先是控制循环，这个我认为是 k8s 的精髓，它通过一个循环来让整个系统趋向与我们申明的一个期望状态。 <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/controllers.md#writing-controllers">https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/controllers.md#writing-controllers</a></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line">for &#123;</span><br><span class="line">    实际状态 := 获取集群中对象 X 的实际状态(Actual State)</span><br><span class="line">    期望状态 := 获取集群中对象 X 的期望状态(Expectation State)</span><br><span class="line">    if 实际状态 == 期望状态&#123;</span><br><span class="line">        什么都不做</span><br><span class="line">    &#125; else &#123;</span><br><span class="line">        执行编排动作，将实际状态调整为期望状态</span><br><span class="line">    &#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><h3 id="informer-的用法"><a href="#informer-的用法" class="headerlink" title="informer 的用法"></a>informer 的用法</h3><p>下面的例子说明了 informer 的用法 <a href="https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/controllers.md#rough-structure">https://github.com/kubernetes/community/blob/master/contributors/devel/sig-api-machinery/controllers.md#rough-structure</a></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewController</span><span class="params">(pods informers.PodInformer)</span></span> *Controller &#123;</span><br><span class="line">    c := &amp;Controller&#123;</span><br><span class="line">        pods: pods.Lister(),</span><br><span class="line">        podsSynced: pods.Informer().HasSynced,</span><br><span class="line">        queue: workqueue.NewNamedRateLimitingQueue(workqueue.DefaultControllerRateLimiter(), <span class="string">&quot;controller-name&quot;</span>),</span><br><span class="line">    &#125;</span><br><span class="line"></span><br><span class="line">    pods.Informer().AddEventHandler(cache.ResourceEventHandlerFuncs&#123;</span><br><span class="line">        AddFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        &#125;,</span><br><span class="line">        UpdateFunc: <span class="function"><span class="keyword">func</span><span class="params">(old <span class="keyword">interface</span>&#123;&#125;, <span class="built_in">new</span> <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        &#125;,</span><br><span class="line">        DeleteFunc: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;)</span></span> &#123;</span><br><span class="line">            <span class="comment">// ...</span></span><br><span class="line">        &#125;,</span><br><span class="line">    &#125;,)</span><br><span class="line"></span><br><span class="line">    <span class="keyword">return</span> c</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>其他你都不需要看，关键在于 <code>AddEventHandler</code> ，看到它你就知道了，<strong>本质就是让你能监听一些变动的事件</strong>，当事件来的时候，你就会知道，具体你知道了之后干嘛，就是你的事情了。</p><h3 id="informer-的流程图"><a href="#informer-的流程图" class="headerlink" title="informer 的流程图"></a>informer 的流程图</h3><p>下面这个图非常清晰的说明了 informer 的流程和关系，<strong>记住这个图片，后面还会用到</strong></p><p><img src="https://blog.linkinstars.com/blog/k8s-informer-flow.png" alt="k8s-informer-flow.png"></p><h3 id="控制对象的思考"><a href="#控制对象的思考" class="headerlink" title="控制对象的思考"></a>控制对象的思考</h3><p>结合以上回忆，思考一下：如果我们希望去控制一个对象，那么我们需要知道这个对象现在的状态是什么，或者知道它发生了什么变化，变化能不能满足我们的期望，如果不满足应该怎么调整。<strong>那么，想要知道一个对象的状态有两种方式，一种是你主动去查询，对吧，而另一种就是让别人告诉你。而 informer 就是后一种。</strong></p><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><ol><li>informer 有那几个组件？</li><li>informer 机制是怎么样的？</li><li>为什么需要 informer？</li></ol><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><p>这次寻码的原因和之前不太一样，之前我们都是为了看某一个东西的源码，就去搜索相关的代码。而这次是在 client-go 的使用过程中产生的好奇，所以从使用的角度，就很容易去寻码了。因为你需要使用这个方法去创建一个 informer ，那么你就会想知道里面究竟发生了什么对吧？那么这次我们就从这个方法 <code>NewIndexerInformer</code> 的 <code>newInformer</code> 开始。</p><h3 id="初始化"><a href="#初始化" class="headerlink" title="初始化"></a>初始化</h3><p>从初始化我们就可以知道 informer 里面究竟有什么。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/client-go/tools/cache/controller.go:380</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">NewIndexerInformer</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">lw ListerWatcher,</span></span></span><br><span class="line"><span class="params"><span class="function">objType runtime.Object,</span></span></span><br><span class="line"><span class="params"><span class="function">resyncPeriod time.Duration,</span></span></span><br><span class="line"><span class="params"><span class="function">h ResourceEventHandler,</span></span></span><br><span class="line"><span class="params"><span class="function">indexers Indexers,</span></span></span><br><span class="line"><span class="params"><span class="function">)</span></span> (Indexer, Controller) &#123;</span><br><span class="line"><span class="comment">// This will hold the client state, as we know it.</span></span><br><span class="line">clientState := NewIndexer(DeletionHandlingMetaNamespaceKeyFunc, indexers)</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> clientState, newInformer(lw, objType, resyncPeriod, h, clientState, <span class="literal">nil</span>)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>这个方法可以帮助我们创建 <code>indexer</code> 和 <code>informer</code> 我们先不管什么是 <code>indexer</code>。直接 <code>newInformer</code>。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/client-go/tools/cache/controller.go:483</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">newInformer</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">lw ListerWatcher,</span></span></span><br><span class="line"><span class="params"><span class="function">objType runtime.Object,</span></span></span><br><span class="line"><span class="params"><span class="function">resyncPeriod time.Duration,</span></span></span><br><span class="line"><span class="params"><span class="function">h ResourceEventHandler,</span></span></span><br><span class="line"><span class="params"><span class="function">clientState Store,</span></span></span><br><span class="line"><span class="params"><span class="function">transformer TransformFunc,</span></span></span><br><span class="line"><span class="params"><span class="function">)</span></span> Controller &#123;</span><br><span class="line"><span class="comment">// This will hold incoming changes. Note how we pass clientState in as a</span></span><br><span class="line"><span class="comment">// KeyLister, that way resync operations will result in the correct set</span></span><br><span class="line"><span class="comment">// of update/delete deltas.</span></span><br><span class="line">fifo := NewDeltaFIFOWithOptions(DeltaFIFOOptions&#123;</span><br><span class="line">KnownObjects:          clientState,</span><br><span class="line">EmitDeltaTypeReplaced: <span class="literal">true</span>,</span><br><span class="line">Transformer:           transformer,</span><br><span class="line">&#125;)</span><br><span class="line"></span><br><span class="line">cfg := &amp;Config&#123;</span><br><span class="line">Queue:            fifo,</span><br><span class="line">ListerWatcher:    lw,</span><br><span class="line">ObjectType:       objType,</span><br><span class="line">FullResyncPeriod: resyncPeriod,</span><br><span class="line">RetryOnError:     <span class="literal">false</span>,</span><br><span class="line"></span><br><span class="line">Process: <span class="function"><span class="keyword">func</span><span class="params">(obj <span class="keyword">interface</span>&#123;&#125;, isInInitialList <span class="type">bool</span>)</span></span> <span class="type">error</span> &#123;</span><br><span class="line"><span class="keyword">if</span> deltas, ok := obj.(Deltas); ok &#123;</span><br><span class="line"><span class="keyword">return</span> processDeltas(h, clientState, deltas, isInInitialList)</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> errors.New(<span class="string">&quot;object given as Process argument is not Deltas&quot;</span>)</span><br><span class="line">&#125;,</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> New(cfg)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>好，一个迷惑点来了，返回一个 <code>Controller</code>？<code>Controller</code>？傻傻分不清楚，这个 <code>Controller</code> 和我们常说的 <code>Controller</code> 组件完全不是一个东西，<strong>这个 <code>Controller</code> 是一个接口，其实它就是 <code>Informer</code> 哦</strong>。</p><p>关键来了，初始化的时候里面有一个 <code>DeltaFIFO</code> 的队列。有一个 <code>Process</code> 的处理方法，我们将 <code>h</code> 也就是我们外部的 <code>handler</code> 函数放进去了，这个 h 就是我们外部用户在使用 client-go 时申明的需要如何处理事件的方法。</p><h3 id="事件去了哪里"><a href="#事件去了哪里" class="headerlink" title="事件去了哪里"></a>事件去了哪里</h3><blockquote><p>源码阅读技巧：你不一定非要按事件的来龙去脉来看源码，有什么看什么，最后再串起来也是可以的</p></blockquote><p>由于我们现在看到了 <code>Process</code> 方法，知道这里是处理事件的，于是我们先看这个 <code>processDeltas</code>。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/client-go/tools/cache/controller.go:436</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="title">processDeltas</span><span class="params">(</span></span></span><br><span class="line"><span class="params"><span class="function">    handler ResourceEventHandler,</span></span></span><br><span class="line"><span class="params"><span class="function">    clientState Store,</span></span></span><br><span class="line"><span class="params"><span class="function">    deltas Deltas,</span></span></span><br><span class="line"><span class="params"><span class="function">    isInInitialList <span class="type">bool</span>,</span></span></span><br><span class="line"><span class="params"><span class="function">)</span></span> <span class="type">error</span> &#123;</span><br><span class="line">    <span class="keyword">for</span> _, d := <span class="keyword">range</span> deltas &#123;</span><br><span class="line">        obj := d.Object</span><br><span class="line"></span><br><span class="line">        <span class="keyword">switch</span> d.Type &#123;</span><br><span class="line">        <span class="keyword">case</span> Sync, Replaced, Added, Updated:</span><br><span class="line">            <span class="keyword">if</span> old, exists, err := clientState.Get(obj); err == <span class="literal">nil</span> &amp;&amp; exists &#123;</span><br><span class="line">                clientState.Update(obj)</span><br><span class="line">                handler.OnUpdate(old, obj)</span><br><span class="line">            &#125; <span class="keyword">else</span> &#123;</span><br><span class="line">                clientState.Add(obj)</span><br><span class="line">                handler.OnAdd(obj, isInInitialList)</span><br><span class="line">            &#125;</span><br><span class="line">        <span class="keyword">case</span> Deleted:</span><br><span class="line">            clientState.Delete(obj)</span><br><span class="line">            handler.OnDelete(obj)</span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line">    <span class="keyword">return</span> <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>删减了部分代码，不说一目了然，可以说是非常明确了。<strong>根据不同的事件类型，调用外部对于 handler 的方法处理事件就可以了。</strong></p><h3 id="事件从哪里来"><a href="#事件从哪里来" class="headerlink" title="事件从哪里来"></a>事件从哪里来</h3><p>那么问题就来了，<strong>这些要处理的事件是从哪里来的呢</strong>？于是我们看 <code>processDeltas</code> 方法的调用方 <code>Process</code> ，也就是从下往上找，是谁调用了 <code>Process</code> 方法呢？还好引用的地方不多，容易被找到关键在这里。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/client-go/tools/cache/controller.go:186</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *controller)</span></span> processLoop() &#123;</span><br><span class="line"><span class="keyword">for</span> &#123;</span><br><span class="line">obj, err := c.config.Queue.Pop(PopProcessFunc(c.config.Process))</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">if</span> err == ErrFIFOClosed &#123;</span><br><span class="line"><span class="keyword">return</span></span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">if</span> c.config.RetryOnError &#123;</span><br><span class="line"><span class="comment">// This is the safe way to re-enqueue.</span></span><br><span class="line">c.config.Queue.AddIfNotPresent(obj)</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>不用多解释，这里 queue 就是我们之前在初始化看到的 <code>DeltaFIFO</code>，虽然我们不知道队列里面是做了什么，但无非这个循环的意思就是，<strong>从队列中不断 Pop 出事件，然后调用 Process 去处理这个事件</strong>。而我们此时可以顺变看一眼，<code>processLoop</code> 方法是在 controller(Informer) Run 也就是启动的时候被一起启动了(代码这里按下不表)。_当然这个循环中有一个重试的机制，如果遇到需要重试的任务，会重新放到队列里面去，一个小的不错设计_。</p><p>那么，只要我们知道是哪里在往这个队列里面塞数据，就知道事件从哪里来了。</p><p>好，下一个坑就出现了，由于我们是倒着看的，那么我想知道谁往队列里面塞数据，如果你想要看这个 queue 有多少地方在放数据，你会发现太多了，由于 DeltaFIFO 这个实现到处都在引用，所以这样看是很难找的。于是我们需要回到原理上来。看图说话，在最上面说 informer 的流程图的时候我们可以看到有一个 <code>Reflector</code> 的东西在放数据。于是乎我们应该去寻找的是这个东西，你好像在哪里看到过呢？没错 Run 的时候也就是在执行 processLoop 之前。</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/client-go/tools/cache/controller.go:129</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *controller)</span></span> Run(stopCh &lt;-<span class="keyword">chan</span> <span class="keyword">struct</span>&#123;&#125;) &#123;</span><br><span class="line"><span class="keyword">defer</span> utilruntime.HandleCrash()</span><br><span class="line"><span class="keyword">go</span> <span class="function"><span class="keyword">func</span><span class="params">()</span></span> &#123;</span><br><span class="line">&lt;-stopCh</span><br><span class="line">c.config.Queue.Close()</span><br><span class="line">&#125;()</span><br><span class="line">r := NewReflectorWithOptions(</span><br><span class="line">c.config.ListerWatcher,</span><br><span class="line">c.config.ObjectType,</span><br><span class="line">c.config.Queue,</span><br><span class="line">ReflectorOptions&#123;</span><br><span class="line">ResyncPeriod:    c.config.FullResyncPeriod,</span><br><span class="line">TypeDescription: c.config.ObjectDescription,</span><br><span class="line">Clock:           c.clock,</span><br><span class="line">&#125;,</span><br><span class="line">)</span><br><span class="line">r.ShouldResync = c.config.ShouldResync</span><br><span class="line">r.WatchListPageSize = c.config.WatchListPageSize</span><br><span class="line"><span class="keyword">if</span> c.config.WatchErrorHandler != <span class="literal">nil</span> &#123;</span><br><span class="line">r.watchErrorHandler = c.config.WatchErrorHandler</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line">c.reflectorMutex.Lock()</span><br><span class="line">c.reflector = r</span><br><span class="line">c.reflectorMutex.Unlock()</span><br><span class="line"></span><br><span class="line"><span class="keyword">var</span> wg wait.Group</span><br><span class="line"></span><br><span class="line">wg.StartWithChannel(stopCh, r.Run)</span><br><span class="line"></span><br><span class="line">wait.Until(c.processLoop, time.Second, stopCh)</span><br><span class="line">wg.Wait()</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>你可以发现，初始化 Reflector 的时候将 <code>c.config.Queue</code> 放进去了作为它的 <code>store</code>，那么关键就在这里面了。之后的链路是： <code>r.Run</code> -&gt; <code>r.ListAndWatch</code> -&gt; <code>r.watch</code> -&gt; <code>watchHandler</code>， 好家伙， 链路还有点长的。好在代码并不复杂。在 <code>watchHandler</code> 中有如下精髓：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// staging/src/k8s.io/client-go/tools/cache/reflector.go:743</span></span><br><span class="line">resourceVersion := meta.GetResourceVersion()</span><br><span class="line"><span class="keyword">switch</span> event.Type &#123;</span><br><span class="line"><span class="keyword">case</span> watch.Added:</span><br><span class="line">    err := store.Add(event.Object)</span><br><span class="line"><span class="keyword">case</span> watch.Modified:</span><br><span class="line">    err := store.Update(event.Object)</span><br><span class="line"><span class="keyword">case</span> watch.Deleted:</span><br><span class="line">    err := store.Delete(event.Object)</span><br><span class="line"><span class="keyword">case</span> watch.Bookmark:</span><br><span class="line">    <span class="keyword">if</span> _, ok := meta.GetAnnotations()[<span class="string">&quot;k8s.io/initial-events-end&quot;</span>]; ok &#123;</span><br><span class="line">        <span class="keyword">if</span> exitOnInitialEventsEndBookmark != <span class="literal">nil</span> &#123;</span><br><span class="line">            *exitOnInitialEventsEndBookmark = <span class="literal">true</span></span><br><span class="line">        &#125;</span><br><span class="line">    &#125;</span><br><span class="line"><span class="keyword">default</span>:</span><br><span class="line">    utilruntime.HandleError(fmt.Errorf(<span class="string">&quot;%s: unable to understand watch event %#v&quot;</span>, name, event))</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>破案了，<code>store.Add</code> 明白了，原来你在这里。这里我们就可以串起来了，图也就非常明白了，代码也疏通了。</p><h3 id="总结一下"><a href="#总结一下" class="headerlink" title="总结一下"></a>总结一下</h3><ol><li><code>Reflector</code> 监听资源变动，将变动放到队列 <code>DeltaFIFO</code> 中</li><li><code>Informer</code> 不停的从队列中拿取，并调用外部的 <code>handler</code> 进行处理</li><li>至于外部怎么处理的，那是外部的事情了</li></ol><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li>informer 有那几个组件<ol><li>关键组件是 Reflector 和 DeltaFIFO</li></ol></li><li>informer 机制是怎么样的<ol><li>监听、同步、事件处理(外部)、重试</li></ol></li><li>为什么需要 informer<ol><li>这是一个小设计了，我们在下面总结提升详细说</li></ol></li></ol><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><h3 id="设计上"><a href="#设计上" class="headerlink" title="设计上"></a>设计上</h3><p>就像我们一开始说的那样，想要控制一个对象，你需要先知道对象的状态。那么第一个设计的优点就来了：<strong>与其不停的去通过 API 查询对象的状态，不如你自己主动去监听状态的变化</strong>。这是一种事件机制的设计，在很多地方都会用到。而主要的原因是查询的无效次数过多，而且 API 的压力又大。</p><p>而在 k8s 中太多需要监控对象的地方了，如果无论是谁来都要写一遍监控的代码，并且还要处理各种事件的解析、队列、重试..太麻烦了，于是 k8s 将其抽象为 Informer 的机制。从外部你只需要关注如何 handle 事件就可以了，从代码看有一种函数闭包的思想在里面。而且哦，关键在于 DeltaFIFO 还有各种细节的优化。</p><h3 id="编码上"><a href="#编码上" class="headerlink" title="编码上"></a>编码上</h3><p><code>DeltaFIFO</code> 真的是一个很不错设计，是值得我们去学习，并且在其他项目中可以直接拿来抄的一种优化方案。</p><ol><li>go 里面利用 sync.cond 的不多见，这个队列算是一个经典案例。Pop 的时候没有元素的时候会 <code>Wait</code> 等着，有元素来了会 <code>Broadcast</code> 通知，节省资源</li><li>在 <code>dedupDeltas</code> 里面会 <code>This will combine the most recent two deltas if they are the same.</code> 也就所谓的压缩事件，也就 “聚合”，这是在给消费端减负，相同的事件你只需处理一次就好了，相当于这里就帮你过滤了一次，好贴心。</li></ol><h3 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h3><p>从原理上来说 informer 本身不复杂，而且真的是一个不错的设计，从我的感受上来总结可以用两个词 <strong>事件+解耦</strong> ，一个事件通知机制加上一个抽象解耦的实现。希望你能体会到，之后我们会用到它。当然这里介绍的是 informer 本身，它的前后都还有好多小助手哦。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>《一起读 kubernetes 源码》pause 你在哪里？</title>
    <link href="https://www.linkinstars.com/post/5b37323a.html"/>
    <id>https://www.linkinstars.com/post/5b37323a.html</id>
    <published>2024-04-12T16:00:00.000Z</published>
    <updated>2024-04-12T07:44:54.000Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/5b37323a.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！</p></blockquote><h2 id="前言"><a href="#前言" class="headerlink" title="前言"></a>前言</h2><p>你有没有在 k8s 的 node 上敲过 <code>docker ps</code> 这个命令，我就干过。而出现的结果大概会是这样的：</p><figure class="highlight bash"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br></pre></td><td class="code"><pre><span class="line">root@10.0.10.102:~# docker ps</span><br><span class="line">CONTAINER ID   IMAGE                           COMMAND                  CREATED      STATUS      PORTS     NAMES</span><br><span class="line">5aa88e8d16ac   xxxx                            <span class="string">&quot;/entrypoint.sh&quot;</span>         3 days ago   Up 3 days             k8s_xxx-0</span><br><span class="line">4e40566baa09   google_containers/pause:3.4.1   <span class="string">&quot;/pause&quot;</span>                 3 days ago   Up 3 days             k8s_POD_xxxx</span><br></pre></td></tr></table></figure><p>你有没有好奇过这个 <code>google_containers/pause</code> 是什么来路？为什么会有一个这个容器，并且和应用总是成对出现的？我就好奇，于是今天就来叭叭一下 pause 是做什么的。<br>最早以前 pause 在一些教程里面叫作 infra，我也是当时受众之一，所以第一次看到 pause 有点奇怪它与 infra 的关系，其实是一个东西。</p><h2 id="前置知识"><a href="#前置知识" class="headerlink" title="前置知识"></a>前置知识</h2><ul><li>Linux namespace</li><li>pod</li><li>cri</li></ul><h2 id="码前提问"><a href="#码前提问" class="headerlink" title="码前提问"></a>码前提问</h2><ol><li>pause 什么时候被创建的？</li><li>pause 是谁创建的？</li><li>pause 的作用是什么？</li></ol><h2 id="心路历程"><a href="#心路历程" class="headerlink" title="心路历程"></a>心路历程</h2><p>作为第一章节的最后一小结，将在这里说明另一个源码阅读要注意的方式方法：<strong>先原理，再源码</strong>。有时候，仅仅只是使用某个工具或项目，一些细节的地方是没有办法在使用中被了解的，比如我们使用了很久 k8s 知道了 pod 的作用以及能力，但我们依旧对 pause 毫无感知，因为它是那种背后默默无闻的东西。对于这些技术的实现，如果直接去看源码会有两个问题，一个是难以理解，另一个则是容易误入歧途，看着看着看叉了。所以，对于 pause 与之前不同的是，我们需要先去弄懂它的原理，了解了大概之后再回去看源码。</p><blockquote><p>如果不了解的请看 <a href="https://www.ianlewis.org/en/almighty-pause-container">https://www.ianlewis.org/en/almighty-pause-container</a></p></blockquote><p>当然，你也可以不看，我直接帮你总结为了一句话：<strong>paues 让 pod 中的多个容器可以 sharing namespaces(共享命名空间)。</strong><br>因为我们知道一个 pod 可以包含多个容器，这些容器可以共享网络资源，并且重要的是 namespace 是隔离的基础，也是运行的保证，如果让任意其他的业务容器去当作主容器被别人共享，那么主容器的安危就决定了整个 pod 的生死，那显然有些不合理，于是找到了中间商 pause 来帮助我们先 hold 所需要的 namespace，然后做共享，这也就是 pause 存在的意义了。<br>你可以根据文中的指令来在本机上运行一个 pause 容器来使用 <code>--net=container:pause</code> 类似的参数来共享，并测试。</p><p>而 pause 在 k8s 中是如何被创建，并且做了哪些事情呢？这就需要到源码中寻找答案了。</p><h2 id="源码分析"><a href="#源码分析" class="headerlink" title="源码分析"></a>源码分析</h2><p>当你想要你 k8s 的源码中寻找 pause 的时候，你就会发现，你能找到一些蛛丝马迹，但是毫无头绪，一开始我也是的，我在源码中搜索了所有有关 pause 的内容，发现并没有看到真正创建这个容器的地方。（此时我还没懂 pause 的原理）于是乎，我回头弄清楚的原理（先原理再源码），发现 pause 的作用是共享命名空间，<strong>那么它的创建一定是在 pod 创建的比较前面步骤，至少要在其他容器创建之前</strong>。</p><p>于是就回到了我们第一节里面，说 pod 创建的时候有一个 SyncPod 的方法</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// SyncPod syncs the running pod into the desired pod by executing following steps://</span></span><br><span class="line"><span class="comment">//  1. Compute sandbox and container changes.</span></span><br><span class="line"><span class="comment">//  2. Kill pod sandbox if necessary.</span></span><br><span class="line"><span class="comment">//  3. Kill any containers that should not be running.</span></span><br><span class="line"><span class="comment">//  4. Create sandbox if necessary.</span></span><br><span class="line"><span class="comment">//  5. Create ephemeral containers.</span></span><br><span class="line"><span class="comment">//  6. Create init containers.</span></span><br><span class="line"><span class="comment">//  7. Resize running containers (if InPlacePodVerticalScaling==true)</span></span><br><span class="line"><span class="comment">//  8. Create normal containers.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(m *kubeGenericRuntimeManager)</span></span> SyncPod(</span><br></pre></td></tr></table></figure><p>我就发现当时有一个 sandbox 容器我们没有管它，难道是它？于是我带着目标去追源码 <code>createPodSandbox</code> 这个方法就是在 <code>SyncPod</code> 里面的第 4 步骤：</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/kubelet/kuberuntime/kuberuntime_sandbox.go:40</span></span><br><span class="line"><span class="comment">// createPodSandbox creates a pod sandbox and returns (podSandBoxID, message, error).</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(m *kubeGenericRuntimeManager)</span></span> createPodSandbox(ctx context.Context, pod *v1.Pod, attempt <span class="type">uint32</span>) (<span class="type">string</span>, <span class="type">string</span>, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">podSandboxConfig, err := m.generatePodSandboxConfig(pod, attempt)</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">err = m.osInterface.MkdirAll(podSandboxConfig.LogDirectory, <span class="number">0755</span>)</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">runtimeHandler, err = m.runtimeClassManager.LookupRuntimeHandler(pod.Spec.RuntimeClassName)</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">podSandBoxID, err := m.runtimeService.RunPodSandbox(ctx, podSandboxConfig, runtimeHandler)</span><br><span class="line"></span><br><span class="line"><span class="keyword">return</span> podSandBoxID, <span class="string">&quot;&quot;</span>, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>其中就是创建了 <code>podSandboxConfig</code> 然后就是 <code>RunPodSandbox</code> 也就是使用必要的配置去启动 Sandbox，接下来要注意，别跟错了</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// pkg/kubelet/cri/remote/remote_runtime.go:176</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(r *remoteRuntimeService)</span></span> RunPodSandbox(ctx context.Context, config *runtimeapi.PodSandboxConfig, runtimeHandler <span class="type">string</span>) (<span class="type">string</span>, <span class="type">error</span>) &#123;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">resp, err := r.runtimeClient.RunPodSandbox(ctx, &amp;runtimeapi.RunPodSandboxRequest&#123;</span><br><span class="line">Config:         config,</span><br><span class="line">RuntimeHandler: runtimeHandler,</span><br><span class="line">&#125;)</span><br><span class="line"></span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">podSandboxID := resp.PodSandboxId</span><br><span class="line"><span class="keyword">return</span> podSandboxID, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>最后终于到了关键了 runtimeClient 调用的 RunPodSandbox</p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// kubernetes/vendor/k8s.io/cri-api/pkg/apis/runtime/v1/api.pb.go</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *runtimeServiceClient)</span></span> RunPodSandbox(ctx context.Context, in *RunPodSandboxRequest, opts ...grpc.CallOption) (*RunPodSandboxResponse, <span class="type">error</span>) &#123;</span><br><span class="line">out := <span class="built_in">new</span>(RunPodSandboxResponse)</span><br><span class="line">err := c.cc.Invoke(ctx, <span class="string">&quot;/runtime.v1.RuntimeService/RunPodSandbox&quot;</span>, in, out, opts...)</span><br><span class="line"><span class="keyword">if</span> err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, err</span><br><span class="line">&#125;</span><br><span class="line"><span class="keyword">return</span> out, <span class="literal">nil</span></span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p>到此，如果你不知道原理，你肯定就懵了。哈？怎么到了一个 pb 里面，并且一个 Invoke 就结束了？此时源码已经追不下去了。这也是读源码最容易遇到的一个问题，由于源码本身会依赖外部的一些实现，导致阅读源码本身并不能理解全部，此时也是原理发挥作用的时候了。让我们来仔细分析一下：</p><ol><li>这个是在一个叫 cri-api 的包下面</li><li>pb 是 Protocol Buffer 也就是 grpc 的一个调用</li></ol><p>所以：得到结论这一定是在调用一个 CRI 的接口，也就是有其他人在实现这个接口，kubelet 负责调用。OK，这里我就不讨论 dockershim 和 containerd 的关系，让我们先来直接看看 containerd 对于 CRI 的实现吧。不要怕，让我们去 containerd 的源码里面看看。</p><h3 id="原来是你-containerd"><a href="#原来是你-containerd" class="headerlink" title="原来是你 containerd"></a>原来是你 containerd</h3><p>于是我直接去 containerd 源码里面搜索 <code>RuntimeService</code> 的 <code>RunPodSandbox</code> 实现。</p><p><a href="https://github.com/containerd/containerd/blob/b693d137ed5f905d04bf955b185054011e25880c/internal/cri/server/sandbox_run.go#L51">https://github.com/containerd/containerd/blob/b693d137ed5f905d04bf955b185054011e25880c/internal/cri/server/sandbox_run.go#L51</a></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br></pre></td><td class="code"><pre><span class="line"><span class="comment">// RunPodSandbox creates and starts a pod-level sandbox. Runtimes should ensure</span></span><br><span class="line"><span class="comment">// the sandbox is in ready state.</span></span><br><span class="line"><span class="function"><span class="keyword">func</span> <span class="params">(c *criService)</span></span> RunPodSandbox(ctx context.Context, r *runtime.RunPodSandboxRequest) (_ *runtime.RunPodSandboxResponse, retErr <span class="type">error</span>) &#123;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line"><span class="keyword">if</span> err := c.sandboxService.CreateSandbox(ctx, sandboxInfo, sb.WithOptions(config), sb.WithNetNSPath(sandbox.NetNSPath)); err != <span class="literal">nil</span> &#123;</span><br><span class="line"><span class="keyword">return</span> <span class="literal">nil</span>, fmt.Errorf(<span class="string">&quot;failed to create sandbox %q: %w&quot;</span>, id, err)</span><br><span class="line">&#125;</span><br><span class="line"><span class="comment">// ...</span></span><br><span class="line">ctrl, err := c.sandboxService.StartSandbox(ctx, sandbox.Sandboxer, id)</span><br><span class="line">&#125;</span><br></pre></td></tr></table></figure><p><code>CreateSandbox</code> 创建，嗯。<code>StartSandbox</code> 启动，嗯。然后我就找，那镜像是哪个，于是让我发现了一个常量</p><p><a href="https://github.com/containerd/containerd/blob/2adae6093e52028580f72c6f8c4f2f06c9d57648/internal/cri/config/config.go#L73">https://github.com/containerd/containerd/blob/2adae6093e52028580f72c6f8c4f2f06c9d57648/internal/cri/config/config.go#L73</a></p><figure class="highlight go"><table><tr><td class="gutter"><pre><span class="line">1</span><br></pre></td><td class="code"><pre><span class="line">DefaultSandboxImage = <span class="string">&quot;registry.k8s.io/pause:3.9&quot;</span></span><br></pre></td></tr></table></figure><p>好家伙，还得是你啊。目前我们就知道了是谁创建的这个 pause 容器，那么这个容器是干嘛的呢？于是乎，我去找找这个容器的镜像是如何构建的，让我们回到 k8s 源码里面看看。</p><h3 id="pause-镜像"><a href="#pause-镜像" class="headerlink" title="pause 镜像"></a>pause 镜像</h3><p>dockerfile 在 <code>kubernetes/build/pause/Dockerfile</code>，非常容易，就是启动一个二进制 <code>/pause</code></p><figure class="highlight plaintext"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br></pre></td><td class="code"><pre><span class="line">ARG BASE</span><br><span class="line">FROM $&#123;BASE&#125;</span><br><span class="line">ARG ARCH</span><br><span class="line">ADD bin/pause-linux-$&#123;ARCH&#125; /pause</span><br><span class="line">USER 65535:65535</span><br><span class="line">ENTRYPOINT [&quot;/pause&quot;]</span><br></pre></td></tr></table></figure><p>这个二进制的源码在 <code>kubernetes/build/pause/linux/pause.c</code></p><figure class="highlight cpp"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br><span class="line">32</span><br><span class="line">33</span><br><span class="line">34</span><br><span class="line">35</span><br><span class="line">36</span><br><span class="line">37</span><br><span class="line">38</span><br></pre></td><td class="code"><pre><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">sigdown</span><span class="params">(<span class="type">int</span> signo)</span> </span>&#123;</span><br><span class="line">  <span class="built_in">psignal</span>(signo, <span class="string">&quot;Shutting down, got signal&quot;</span>);</span><br><span class="line">  <span class="built_in">exit</span>(<span class="number">0</span>);</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">static</span> <span class="type">void</span> <span class="title">sigreap</span><span class="params">(<span class="type">int</span> signo)</span> </span>&#123;</span><br><span class="line">  <span class="keyword">while</span> (<span class="built_in">waitpid</span>(<span class="number">-1</span>, <span class="literal">NULL</span>, WNOHANG) &gt; <span class="number">0</span>)</span><br><span class="line">    ;</span><br><span class="line">&#125;</span><br><span class="line"></span><br><span class="line"><span class="function"><span class="type">int</span> <span class="title">main</span><span class="params">(<span class="type">int</span> argc, <span class="type">char</span> **argv)</span> </span>&#123;</span><br><span class="line">  <span class="type">int</span> i;</span><br><span class="line">  <span class="keyword">for</span> (i = <span class="number">1</span>; i &lt; argc; ++i) &#123;</span><br><span class="line">    <span class="keyword">if</span> (!<span class="built_in">strcasecmp</span>(argv[i], <span class="string">&quot;-v&quot;</span>)) &#123;</span><br><span class="line">      <span class="built_in">printf</span>(<span class="string">&quot;pause.c %s\n&quot;</span>, <span class="built_in">VERSION_STRING</span>(VERSION));</span><br><span class="line">      <span class="keyword">return</span> <span class="number">0</span>;</span><br><span class="line">    &#125;</span><br><span class="line">  &#125;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">getpid</span>() != <span class="number">1</span>)</span><br><span class="line">    <span class="comment">/* Not an error because pause sees use outside of infra containers. */</span></span><br><span class="line">    <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;Warning: pause should be the first process\n&quot;</span>);</span><br><span class="line"></span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">sigaction</span>(SIGINT, &amp;(<span class="keyword">struct</span> sigaction)&#123;.sa_handler = sigdown&#125;, <span class="literal">NULL</span>) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">1</span>;</span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">sigaction</span>(SIGTERM, &amp;(<span class="keyword">struct</span> sigaction)&#123;.sa_handler = sigdown&#125;, <span class="literal">NULL</span>) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">2</span>;</span><br><span class="line">  <span class="keyword">if</span> (<span class="built_in">sigaction</span>(SIGCHLD, &amp;(<span class="keyword">struct</span> sigaction)&#123;.sa_handler = sigreap,</span><br><span class="line">                                             .sa_flags = SA_NOCLDSTOP&#125;,</span><br><span class="line">                <span class="literal">NULL</span>) &lt; <span class="number">0</span>)</span><br><span class="line">    <span class="keyword">return</span> <span class="number">3</span>;</span><br><span class="line"></span><br><span class="line">  <span class="keyword">for</span> (;;)</span><br><span class="line">    <span class="built_in">pause</span>();</span><br><span class="line">  <span class="built_in">fprintf</span>(stderr, <span class="string">&quot;Error: infinite loop terminated\n&quot;</span>);</span><br><span class="line">  <span class="keyword">return</span> <span class="number">42</span>;</span><br><span class="line">&#125;</span><br><span class="line"></span><br></pre></td></tr></table></figure><p>就这？没错，这就是全部了。里面做了什么事情呢？</p><ol><li>如果有 -v 打印版本号</li><li>看看自己是不是第一个进程 pid 是不是 1</li><li>处理 SIGINT、SIGTERM、SIGCHLD 三个信号</li><li>死循环等着吧</li></ol><p>其实也不过如此是吧，当这个容器创建之后，就如同最开始说的，比如 docker 就可以通过 <code>--net=container:pause</code> 共享你需要的 namespace 了。</p><h2 id="码后解答"><a href="#码后解答" class="headerlink" title="码后解答"></a>码后解答</h2><ol><li>pause 什么时候被创建的？<ol><li>pod 创建的第一个步骤被创建的</li></ol></li><li>pause 是谁创建的？<ol><li>CRI 的实现者，可以是 containerd、docker</li></ol></li><li>pause 的作用是什么？<ol><li>成为 pid 为 1 也就是第一个进程从而 “hold 住” namespace</li></ol></li></ol><h2 id="总结提升"><a href="#总结提升" class="headerlink" title="总结提升"></a>总结提升</h2><p>pause 作为 pod 创建的最后一块拼图，已经拼上了，至此我觉得 pod 本身的原理应该已经明确了。这一节的代码不复杂，主要是想让你明白，有时候需要明确里面的设计原理和思路再去看代码，否则很容易看不懂或者掉入怪圈里面。在遇到一些外部调用和扩展的时候也不用慌张，努力去发现一些蛛丝马迹，结合已有的知识点大胆假设，小心求证，你总能在源码中找到属于你的真相。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;📢 注意，该文本非最终版本，正在更新中，版权所有，请勿转载！！&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;前言&quot;&gt;&lt;a href=&quot;#前言&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/categories/kubernetes/"/>
    
    
    <category term="kubernetes" scheme="https://www.linkinstars.com/tags/kubernetes/"/>
    
  </entry>
  
  <entry>
    <title>假如 Redis 里面有 1 亿个 key，其中有 10w 个 key 是以某个固定的已知的前缀开头的，如何将它们全部找出来？</title>
    <link href="https://www.linkinstars.com/post/c09dd4e8.html"/>
    <id>https://www.linkinstars.com/post/c09dd4e8.html</id>
    <published>2024-03-31T16:00:00.000Z</published>
    <updated>2024-07-03T04:05:23.593Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/c09dd4e8.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<blockquote><p>这个问题本身不难，但网上的教程答案让我很不理解，所以单独拿来吐槽一下</p></blockquote><h2 id="来源与网络的答案"><a href="#来源与网络的答案" class="headerlink" title="来源与网络的答案"></a>来源与网络的答案</h2><p><img src="https://blog.linkinstars.com/blog/redis-scan-command-benchmark.png" alt="redis-scan-command-benchmark"></p><p>我特意用了截图而不是贴链接。其中“如何”还打成了如果…</p><h3 id="有什么问题？"><a href="#有什么问题？" class="headerlink" title="有什么问题？"></a>有什么问题？</h3><p>如果我是面试官，问了这个问题，如果你第一回答是 keys，那么恭喜你可以回去等通知了（言重了，说白了就不往下问了）</p><p>1 亿个，你知道什么概念吗？如果直接 keys 一下线上的数据不知道要阻塞多久，你下面的回答明明就知道答案偏偏把人家往沟里带…</p><p><strong>但如果就只是如此，我也不用写这篇博客了，我想说的是 SCAN 也不是最优解</strong></p><h2 id="SCAN-有什么问题"><a href="#SCAN-有什么问题" class="headerlink" title="SCAN 有什么问题"></a>SCAN 有什么问题</h2><p>不卡，但是慢，下面是来源与网络的一个测试结果，<a href="https://www.cnblogs.com/jinanxiaolaohu/p/17302734.html">Redis 性能问题诊断以及 scan 命令耗时分析</a></p><figure class="highlight yaml"><table><tr><td class="gutter"><pre><span class="line">1</span><br><span class="line">2</span><br><span class="line">3</span><br><span class="line">4</span><br><span class="line">5</span><br><span class="line">6</span><br><span class="line">7</span><br><span class="line">8</span><br><span class="line">9</span><br><span class="line">10</span><br><span class="line">11</span><br><span class="line">12</span><br><span class="line">13</span><br><span class="line">14</span><br><span class="line">15</span><br><span class="line">16</span><br><span class="line">17</span><br><span class="line">18</span><br><span class="line">19</span><br><span class="line">20</span><br><span class="line">21</span><br><span class="line">22</span><br><span class="line">23</span><br><span class="line">24</span><br><span class="line">25</span><br><span class="line">26</span><br><span class="line">27</span><br><span class="line">28</span><br><span class="line">29</span><br><span class="line">30</span><br><span class="line">31</span><br></pre></td><td class="code"><pre><span class="line"><span class="string">测试命令:</span></span><br><span class="line"><span class="string">./redis-benchmark</span> <span class="string">-a</span> <span class="string">xxxx</span>  <span class="string">-r</span> <span class="number">10000</span> <span class="string">-n</span> <span class="number">100</span> <span class="string">-c</span> <span class="number">8000 </span><span class="string">scan</span> <span class="number">0</span> <span class="string">match</span> <span class="string">zhaobsh*</span> <span class="string">count</span>  <span class="number">10000</span></span><br><span class="line"><span class="number">10000</span><span class="string">个随机key,</span> <span class="string">测试100次,</span> <span class="string">使用</span> <span class="number">80000</span><span class="string">个client进行测试验证.</span></span><br><span class="line"><span class="string">被测试的命令为:</span> <span class="string">scan</span> <span class="number">0</span> <span class="string">match</span> <span class="string">zhaobsh*</span> <span class="string">count</span>  <span class="number">10000</span></span><br><span class="line"></span><br><span class="line"><span class="string">一万个key时count</span> <span class="string">一万时:</span></span><br><span class="line"><span class="attr">Summary:</span></span><br><span class="line">  <span class="attr">throughput summary:</span> <span class="number">99.11</span> <span class="string">requests</span> <span class="string">per</span> <span class="string">second</span></span><br><span class="line">  <span class="attr">latency summary (msec):</span></span><br><span class="line">          <span class="string">avg</span>       <span class="string">min</span>       <span class="string">p50</span>       <span class="string">p95</span>       <span class="string">p99</span>       <span class="string">max</span></span><br><span class="line">      <span class="number">993.304</span>    <span class="number">17.472</span>  <span class="number">1004.543  </span><span class="number">1005.055  </span><span class="number">1005.055  </span><span class="number">1005.055</span></span><br><span class="line"></span><br><span class="line"><span class="string">十万个key时</span> <span class="string">count</span> <span class="string">十万时</span></span><br><span class="line"><span class="attr">Summary:</span></span><br><span class="line">  <span class="attr">throughput summary:</span> <span class="number">11.54</span> <span class="string">requests</span> <span class="string">per</span> <span class="string">second</span></span><br><span class="line">  <span class="attr">latency summary (msec):</span></span><br><span class="line">          <span class="string">avg</span>       <span class="string">min</span>       <span class="string">p50</span>       <span class="string">p95</span>       <span class="string">p99</span>       <span class="string">max</span></span><br><span class="line">     <span class="number">2970.660   </span><span class="number">135.680</span>  <span class="number">3000.319  </span><span class="number">3000.319  </span><span class="number">3000.319  </span><span class="number">3000.319</span></span><br><span class="line"><span class="string">五十万个key时</span> <span class="string">count</span> <span class="string">五十万时</span></span><br><span class="line"><span class="attr">Summary:</span></span><br><span class="line">  <span class="attr">throughput summary:</span> <span class="number">3.46</span> <span class="string">requests</span> <span class="string">per</span> <span class="string">second</span></span><br><span class="line">  <span class="attr">latency summary (msec):</span></span><br><span class="line">          <span class="string">avg</span>       <span class="string">min</span>       <span class="string">p50</span>       <span class="string">p95</span>       <span class="string">p99</span>       <span class="string">max</span></span><br><span class="line">     <span class="number">2972.532   </span><span class="number">322.816</span>  <span class="number">3000.319  </span><span class="number">3000.319  </span><span class="number">3000.319  </span><span class="number">3000.319</span></span><br><span class="line"></span><br><span class="line"><span class="string">自己进行了一下验证,</span> <span class="string">如果直接一次scan</span> <span class="string">一千万的记录</span></span><br><span class="line"><span class="string">耗时为:</span> <span class="number">10.15</span><span class="string">秒.</span></span><br><span class="line"><span class="string">理论上scan</span> <span class="string">一个键值对的时间为</span> <span class="number">1</span><span class="string">微秒左右.</span></span><br><span class="line"></span><br><span class="line"><span class="string">如果redis里面有</span> <span class="number">1000</span><span class="string">万个key的话</span>  <span class="number">60</span><span class="string">台服务器如果同时进行一次所有的scan</span></span><br><span class="line"><span class="string">那么搞不好至少会有在</span> <span class="string">运行期间内产生总计</span> <span class="string">600S</span> <span class="string">的延迟时间.</span></span><br></pre></td></tr></table></figure><p>显然，多少还是会给线上系统有影响的，可以有个毛刺？具体就看实际情况了。</p><p>那么我们来分析一下问题，如果面试官只是要考查你 KEYS 命令和 SCAN 命令的区别，并且想要看看你知不知道 KEYS 命令的阻塞问题，那么你回答 SCAN 就已经过了。</p><p>而实际中，如果真的有经验，你就会发现 SCAN 的能力阈值是在那里的。于是你需要继续反问面试官，是否有时间要求。</p><h3 id="实际业务"><a href="#实际业务" class="headerlink" title="实际业务"></a>实际业务</h3><p>在实际业务中，我能想到的场景有两个：</p><ol><li>明知山有虎：就是你本身就有这样的业务场景需要去做所有当前 key 的统一操作，那么以空间换时间，提前以其他数据结构存储你需要的 key 才合理。比如，现在想要让 user 的 key 全部过期，至少我不会去考虑使用 scan 遍历出来然后再进行处理。</li><li>意外的统计：我现在突然有一个统计的需求，但统计的数据只有缓存里面有。那么没办法，甲方最大，SCAN 就 SCAN 吧。但要注意 SCAN 重复性的问题。</li></ol><h2 id="总结"><a href="#总结" class="headerlink" title="总结"></a>总结</h2><p>我其实想说的是，作为线上的数据和操作，你的每次操作都需要明确可能会带来的后果是什么，并不是简单的别人说 SCAN 就 SCAN 了，你需要清楚的了解可能的后果，你才有底气去操作。同时也会让问你的人清楚，你是有过经验的人，而非纸上谈兵。</p>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;blockquote&gt;
&lt;p&gt;这个问题本身不难，但网上的教程答案让我很不理解，所以单独拿来吐槽一下&lt;/p&gt;
&lt;/blockquote&gt;
&lt;h2 id=&quot;来源与网络的答案&quot;&gt;&lt;a href=&quot;#来源与网络的答案&quot; class=&quot;headerlink&quot;</summary>
        
      
    
    
    
    <category term="redis" scheme="https://www.linkinstars.com/categories/redis/"/>
    
    
    <category term="redis" scheme="https://www.linkinstars.com/tags/redis/"/>
    
  </entry>
  
  <entry>
    <title>不定期刊</title>
    <link href="https://www.linkinstars.com/post/22fd52ad.html"/>
    <id>https://www.linkinstars.com/post/22fd52ad.html</id>
    <published>2024-03-30T16:00:00.000Z</published>
    <updated>2025-05-30T16:00:00.000Z</updated>
    
    <content type="html">
    <![CDATA[<a href="https://www.linkinstars.com/post/22fd52ad.html">RSS 阅读体验可能不太好，若喜欢本文请点此跳转原文查看~</a><br><br>]]>
    <![CDATA[<h1 id="点我跳转查看-Notion-Paper"><a href="#点我跳转查看-Notion-Paper" class="headerlink" title="点我跳转查看 Notion Paper"></a><a href="https://space.linkinstars.com/notion-paper/home">点我跳转查看 Notion Paper</a></h1><h2 id="写在前面"><a href="#写在前面" class="headerlink" title="写在前面"></a>写在前面</h2><p>之前在博客装修的时候就提到过，对于之前的设计专栏确实不太适合我，写了一段时间之后发现太难了，所以就一直没有更新。但是我还是想要有一个专门的地方来记录一些自己的想法，所以就有了这个 Notion Paper。</p><p>这里会记录我日常看到的一些设计、产品、技术等等的一些想法。因为本身还是一个程序员，所以大多还是会有一些技术相关的内容，各种技术的使用用法也会在里面。</p><p>我特别喜欢报纸的展示形式，因为报纸有时候可以最大限度的展示信息，于是我就参考了报纸的形式和版式来设计这个 Notion Paper 的样子。希望你也会喜欢它。</p><iframe src="https://space.linkinstars.com/notion-paper/home" width="100%" height="1000" frameborder="0" loading="lazy" allowfullscreen></iframe>]]>
    </content>
    
    
      
      
        
        
    <summary type="html">&lt;h1 id=&quot;点我跳转查看-Notion-Paper&quot;&gt;&lt;a href=&quot;#点我跳转查看-Notion-Paper&quot; class=&quot;headerlink&quot; title=&quot;点我跳转查看 Notion Paper&quot;&gt;&lt;/a&gt;&lt;a</summary>
        
      
    
    
    
    <category term="Notion Paper" scheme="https://www.linkinstars.com/categories/Notion-Paper/"/>
    
    
  </entry>
  
</feed>