Getting started with a TensorFlow surgery classifier with TensorBoard data visualization

Image by: Opensource.com

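The most challenging part of deep learning is labeling, as you'll see in part one of this series, Learn how to classify images with TensorFlow. Proper training is critical to effective future classification, and for training to work, we need lots of accurately labeled data. In part one, I skipped over this challenge by downloading 3,000 prelabeled images. I then showed you how to use this labeled data to train your classifier with TensorFlow. In this part we'll train with a new data set, and I'll introduce the TensorBoard suite of data visualization tools to make it easier to understand, debug, and optimize our TensorFlow code.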

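Given my work as VP of engineering and compliance at healthcare technology company C-SATS, I was eager to build a classifier for something related to surgery. Suturing seemed like a great place to start. It is immediately useful, and I know how to recognize it. It is useful because, for example, if a machine can see when suturing is occurring, it can automatically identify the step (phase) of a surgical procedure where suturing takes place, such as anastomosis. And I can recognize it because the needle and thread of a surgical suture are distinct, even to my layperson's eyes.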

My goal was to train a machine to identify suturing in medical videos.

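I have access to billions of frames of non-identifiable surgical video, many of which contain suturing. But that brings me back to the labeling problem. Luckily, C-SATS has an army of experienced annotators who are experts at exactly this. My source data were video files and annotations in JSON.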

The annotations look like this:

[
    {
        "annotations": [
            {
                "endSeconds": 2115.215,
                "label": "suturing",
                "startSeconds": 2319.541
            },
            {
                "endSeconds": 2976.301,
                "label": "suturing",
                "startSeconds": 2528.884
            }
        ],
        "durationSeconds": 2975,
        "videoId": 5
    },
    {
        "annotations": [
        // ...etc...
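
Before grabbing frames from annotations like these, it can help to sanity-check them. The sketch below is not part of the original workflow: find_suspect_segments is a hypothetical helper, and the sample string simply repeats the first video from the excerpt. It flags any segment whose startSeconds is not earlier than its endSeconds. In the excerpt, the first segment has a start of 2319.541 and an end of 2115.215, so it would be flagged, and a strict "start < t < end" test could never match it.

```python
import json

# Sample data copied from the excerpt above (one video, two segments).
SAMPLE = """
[
    {
        "annotations": [
            {"endSeconds": 2115.215, "label": "suturing", "startSeconds": 2319.541},
            {"endSeconds": 2976.301, "label": "suturing", "startSeconds": 2528.884}
        ],
        "durationSeconds": 2975,
        "videoId": 5
    }
]
"""

def find_suspect_segments(videos):
    """Return (videoId, segment index, reason) for segments that look wrong.

    Hypothetical helper for illustration; it is not part of grab.py.
    """
    problems = []
    for video in videos:
        for index, segment in enumerate(video["annotations"]):
            if segment["startSeconds"] < 0:
                problems.append((video["videoId"], index, "start is negative"))
            elif segment["startSeconds"] >= segment["endSeconds"]:
                problems.append((video["videoId"], index, "start is not before end"))
    return problems

if __name__ == "__main__":
    for problem in find_suspect_segments(json.loads(SAMPLE)):
        print(problem)
```

Whether a flagged segment reflects a slip in the excerpt or in the underlying data is something only the annotation source can answer; the check just points at where to look.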

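I wrote a Python script that uses the JSON annotations to decide which frames to grab from the .mp4 video files. ffmpeg does the actual grabbing. I decided to grab at most one frame per second, then I divided the total number of video seconds by four to get 10k seconds (10k frames). After I figured out which seconds to grab, I ran a quick test to see whether a particular second was inside or outside a segment annotated as suturing (isWithinSuturingSegment() in the code below). Here's grab.py: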

#!/usr/bin/env python3

# Grab frames from videos with ffmpeg. Use multiple cores.
# Minimum resolution is 1 second--this is a shortcut to get fewer frames.

# (C)2017 Adam Monsen. License: AGPL v3 or later.

import json
import os
import subprocess
from multiprocessing import Pool

frameList = []

def isWithinSuturingSegment(annotations, timepointSeconds):
    # True if the timepoint falls strictly inside any annotated segment
    for annotation in annotations:
        startSeconds = annotation['startSeconds']
        endSeconds = annotation['endSeconds']
        if startSeconds < timepointSeconds < endSeconds:
            return True
    return False

with open('available-suturing-segments.json') as f:
    j = json.load(f)

    for video in j:
        videoId = video['videoId']
        videoDuration = video['durationSeconds']

        # generate many ffmpeg frame-grabbing commands
        start = 1
        stop = int(videoDuration) # range() needs integers
        step = 4 # Reduce to grab more frames
        for timepointSeconds in range(start, stop, step):
            inputFilename = '/home/adam/Downloads/suturing-videos/{}.mp4'.format(videoId)
            outputFilename = '{}-{}.jpg'.format(videoId, timepointSeconds)
            # sort frames into one folder per label
            # (both output folders must already exist)
            if isWithinSuturingSegment(video['annotations'], timepointSeconds):
                outputFilename = 'suturing/{}'.format(outputFilename)
            else:
                outputFilename = 'not-suturing/{}'.format(outputFilename)
            outputFilename = '/home/adam/local/{}'.format(outputFilename)

            # -ss seeks to the timepoint; -frames:v 1 writes a single frame
            commandString = 'ffmpeg -loglevel quiet -ss {} -i {} -frames:v 1 {}'.format(
                timepointSeconds, inputFilename, outputFilename)

            frameList.append({
                'outputFilename': outputFilename,
                'commandString': commandString,
            })

def grabFrame(f):
    # skip frames grabbed by an earlier run so the script can be resumed
    if os.path.isfile(f['outputFilename']):
        print('already completed {}'.format(f['outputFilename']))
    else:
        print('processing {}'.format(f['outputFilename']))
        # split() is safe here only because the paths contain no spaces
        subprocess.check_call(f['commandString'].split())

if __name__ == '__main__':
    with Pool(4) as p: # for my 4-core laptop
        p.map(grabFrame, frameList)
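
To make the sampling arithmetic concrete, here is a small sketch of the two decisions grab.py makes for each video: which seconds to sample, and which folder each frame lands in. The function names timepoints and label_for are hypothetical stand-ins for the loop and for isWithinSuturingSegment(), and the 30-second video with a single suturing segment is made up for illustration. Sampling every fourth second is also why 10k frames implies roughly 40k seconds of source video.

```python
def timepoints(duration_seconds, step=4, start=1):
    """Seconds at which a frame would be grabbed, mirroring the loop in grab.py."""
    return list(range(start, int(duration_seconds), step))

def label_for(annotations, timepoint_seconds):
    """Folder name for a frame: 'suturing' if strictly inside a segment."""
    for annotation in annotations:
        if annotation["startSeconds"] < timepoint_seconds < annotation["endSeconds"]:
            return "suturing"
    return "not-suturing"

if __name__ == "__main__":
    # Made-up 30-second video with one suturing segment from 10s to 20s.
    annotations = [{"startSeconds": 10, "endSeconds": 20, "label": "suturing"}]
    for t in timepoints(30):
        print(t, label_for(annotations, t))

    # Number of frames sampled from the 2975-second video in the excerpt.
    print(len(timepoints(2975)))
```

With a step of 4, the 2975-second video from the annotation excerpt contributes 744 frames; lowering the step yields proportionally more.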

Now we're ready to retrain the model again, exactly as before.

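Using this script to snip out 10k frames took me about 10 minutes, and it then took an hour or so to retrain Inception to recognize suturing at 90% accuracy. I did spot checks with new data that was not part of the training set, and every frame I tried was correctly identified (mean confidence score: 88%, median confidence score: 91%).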

Here are my spot checks. (WARNING: Contains links to images of blood and guts.)

Image	Not-suturing score	Suturing score
Not-Suturing-01.jpg	0.71053	0.28947
Not-Suturing-02.jpg	0.94890	0.05110
Not-Suturing-03.jpg	0.99825	0.00175
Suturing-01.jpg	0.08392	0.91608
Suturing-02.jpg	0.08851	0.91149
Suturing-03.jpg	0.18495	0.81505
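
The mean and median quoted above can be reproduced from this table. The sketch below copies the six rows and, as an assumption, treats the file name prefix as the expected class, so the "confidence" for each frame is the score the model gave to that class.

```python
from statistics import mean, median

# (image, not-suturing score, suturing score), copied from the table above.
SPOT_CHECKS = [
    ("Not-Suturing-01.jpg", 0.71053, 0.28947),
    ("Not-Suturing-02.jpg", 0.94890, 0.05110),
    ("Not-Suturing-03.jpg", 0.99825, 0.00175),
    ("Suturing-01.jpg", 0.08392, 0.91608),
    ("Suturing-02.jpg", 0.08851, 0.91149),
    ("Suturing-03.jpg", 0.18495, 0.81505),
]

def confidence_in_expected_class(name, not_suturing, suturing):
    # Assumption: the file name prefix tells us the expected class.
    return not_suturing if name.startswith("Not-Suturing") else suturing

scores = [confidence_in_expected_class(*row) for row in SPOT_CHECKS]

if __name__ == "__main__":
    print("all correct:", all(score > 0.5 for score in scores))
    print("mean: {:.0%}".format(mean(scores)))
    print("median: {:.0%}".format(median(scores)))
```

Every frame scores above 0.5 for its expected class, and the mean and median round to 88% and 91%.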

How to use TensorBoard

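Visualizing what's happening under the hood and communicating it to others is at least as hard with deep learning as it is with any other kind of software. TensorBoard to the rescue!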

The retrain.py script from part one automatically generates the files TensorBoard uses to draw graphs representing what happened during retraining.

To set up TensorBoard, run the following commands inside the container after running retrain.py:

pip install tensorboard
tensorboard --logdir /tmp/retrain_logs

Watch the output and open the printed URL in a browser.

Starting TensorBoard 41 on port 6006
(You can navigate to http://172.17.0.2:6006)

You'll see something like this:

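I hope this helps; if not, you'll at least have something cool to show. During retraining, I found it helpful to watch under the "SCALARS" tab how accuracy increases while cross-entropy decreases as more training steps are performed. This is exactly what we want.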
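
To see why those two curves move in opposite directions, here is a minimal sketch in plain Python. It assumes the usual per-example cross-entropy, the negative log of the probability assigned to the correct class; it is not code from retrain.py, and the probabilities are made up to stand in for an early and a late training step.

```python
import math

def cross_entropy(probability_of_true_class):
    """Cross-entropy loss for one example: -ln(p) of the correct class."""
    return -math.log(probability_of_true_class)

def summarize(probabilities):
    """Return (accuracy, mean cross-entropy) for a two-class problem.

    Each entry is the probability the model assigned to the correct class;
    a prediction counts as correct when that probability exceeds 0.5.
    """
    accuracy = sum(p > 0.5 for p in probabilities) / len(probabilities)
    loss = sum(cross_entropy(p) for p in probabilities) / len(probabilities)
    return accuracy, loss

if __name__ == "__main__":
    # Made-up probabilities, not measurements from the retraining run.
    early = [0.40, 0.55, 0.45, 0.60]
    late = [0.85, 0.95, 0.90, 0.80]
    print("early: accuracy={:.2f} cross-entropy={:.3f}".format(*summarize(early)))
    print("late:  accuracy={:.2f} cross-entropy={:.3f}".format(*summarize(late)))
```

As the model grows more confident in the correct class, more predictions cross the 0.5 threshold (accuracy rises) and the negative log of each probability shrinks toward zero (cross-entropy falls).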

Learn more

If you'd like to learn more, explore these resources:

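Pete Warden's excellent TensorFlow for Poets is a good, purely practical tutorial on transfer learning with Inception, but some of its links are broken. This step-by-step tutorial is up to date and conveniently broken down.
For more code and deeper explanations, try the image recognition and retraining tutorials on tensorflow.org.
I prefer reading to watching when I'm learning, but I found the video Build a TensorFlow Image Classifier in 5 Min remarkably useful and complete. If you prefer something less silly, consider Josh Gordon's tutorials, such as this one.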

Here are the other resources I used in this series, which may help you too: