Neural Enhance
==============
**Example #1** — China Town: `view comparison <http://5.9.70.47:4141/w/3b3c8054-9d00-11e6-9558-c86000be451f/view>`_ in 24-bit HD, `original photo <https://flic.kr/p/gnxcXH>`_ CC-BY-SA @cyalex.
.. image:: docs/Chinatown_example.gif
`As seen on TV! <https://www.youtube.com/watch?v=LhF_56SxrGk>`_ What if you could increase the resolution of your photos using technology from CSI laboratories? Thanks to deep learning and ``#NeuralEnhance``, it's now possible to train a neural network to zoom in to your images at 2x or even 4x. You'll get even better results by increasing the number of neurons or using specialized training images (e.g. faces).
The catch? The neural network is hallucinating details based on its training from example images. It's not reconstructing your photo exactly as it would have been if it were HD. That's only possible in Hollywood, but using deep learning as "Creative AI" works and it's just as cool! Here's how you can get started...
1. `Examples & Usage <#1-examples--usage>`_
2. `Installation <#2-installation--setup>`_
3. `Background & Research <#3-background--research>`_
4. `Troubleshooting <#4-troubleshooting-problems>`_
5. `Frequent Questions <#5-frequent-questions>`_
|Python Version| |License Type| |Project Stars|
----
1. Examples & Usage
===================
The main script is called ``enhance.py``, which you can run with Python 3.4+ once it's `set up <#2-installation--setup>`_ as below. The ``--device`` argument lets you specify which GPU or CPU to use. For the samples above, here are the performance results:

* **GPU Rendering HQ**: Assuming you have CUDA set up and enough on-board RAM to fit the image and neural network, generating 1080p output should complete in 5 seconds, or 2 seconds per image when several images are processed at the same time.
* **CPU Rendering HQ**: This takes roughly 20 to 60 seconds for 1080p output; however, on most machines you can run 4-8 processes simultaneously given enough system RAM. Runtime depends on the neural network size.

The default is ``--device=cpu``; if you have an NVIDIA card already set up with CUDA, try ``--device=gpu0``. On the CPU, you can also set the environment variable ``OMP_NUM_THREADS=4``, which is most useful when running the script multiple times in parallel.
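
As a rough sketch of the parallel CPU setup described above, the following Python helper starts one ``enhance.py`` process per image and sets ``OMP_NUM_THREADS`` for each of them. The function names ``build_job`` and ``run_parallel`` are hypothetical and not part of this project; only the ``--device`` flag and the environment variable come from the text above.

.. code:: python

    import os
    import subprocess
    import sys


    def build_job(image, threads=4, device="cpu"):
        """Return (command, environment) for one enhance.py run (hypothetical helper)."""
        command = [sys.executable, "enhance.py", "--device=" + device, image]
        # Copy the current environment and limit the OpenMP threads per process.
        env = dict(os.environ, OMP_NUM_THREADS=str(threads))
        return command, env


    def run_parallel(images, threads=4):
        """Start one process per image, wait for all of them, return the exit codes."""
        processes = []
        for image in images:
            command, env = build_job(image, threads)
            processes.append(subprocess.Popen(command, env=env))
        return [process.wait() for process in processes]

Calling ``run_parallel(["a.png", "b.png"])`` from the folder that contains ``enhance.py`` would launch two processes at once; keep the list within the 4-8 simultaneous processes mentioned above, depending on available system RAM.
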
1.a) Enhancing Images
---------------------
.. code:: bash

    # Run the super-resolution script for one or more images.
    python3 enhance.py example.png

    # Display output image that has `_enhanced.png` suffix.
    open example_enhanced.png

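
When enhancing many images, it can help to know which outputs already exist. The sketch below assumes that the naming shown in the example (``example.png`` becomes ``example_enhanced.png``) applies to every input and that the output is written next to the input; both are assumptions, and the helper names ``enhanced_path`` and ``pending`` are hypothetical.

.. code:: python

    import os


    def enhanced_path(path):
        """Map an input image path to the output path assumed from the example above."""
        stem, _extension = os.path.splitext(path)
        return stem + "_enhanced.png"


    def pending(paths):
        """Return the inputs whose assumed output file does not exist yet."""
        return [p for p in paths if not os.path.exists(enhanced_path(p))]
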
1.b) Training Super-Resolution
------------------------------
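.. code:: bash

    # Remove any previously saved model file before training.
    rm -f ne4x.pkl.bz2

    # First run: train with the adversarial loss disabled (--adversary-weight=0.0).
    python3.4 enhance.py --train --epochs=25 \
          --scales=2 --perceptual-layer=conv2_2 \
          --generator-block=16 --generator-filters=128 \
          --smoothness-weight=1e7 --adversary-weight=0.0

    # Second run: train with the adversarial loss enabled (--adversary-weight=2e2).
    python3.4 enhance.py --train --epochs=250 \
          --scales=2 --perceptual-layer=conv5_2 \
          --smoothness-weight=5e4 --adversary-weight=2e2 \
          --generator-start=1 --discriminator-start=0 --adversarial-start=1
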
**Example #2** — Bank Lobby: `view comparison <http://5.9.70.47:4141/w/38d10880-9ce6-11e6-becb-c86000be451f/view>`_ in 24-bit HD, `original photo <https://flic.kr/p/6a8cwm>`_ CC-BY-SA @benarent.
.. image:: docs/BankLobby_example.gif
2. Installation & Setup
=======================
2.a) Using Docker Image [recommended]
-------------------------------------
(work in progress)
2.b) Manual Installation [developers]
-------------------------------------
This project requires Python 3.4+ and you'll also need ``numpy`` and ``scipy`` (numerical computing libraries) as well as ``python3-dev`` installed system-wide. If you want more detailed instructions, follow these:
1. `Linux Installation of Lasagne <https://github.com/Lasagne/Lasagne/wiki/From-Zero-to-Lasagne-on-Ubuntu-14.04>`_ **(intermediate)**
2. `Mac OSX Installation of Lasagne <http://deeplearning.net/software/theano/install.html#mac-os>`_ **(advanced)**
3. `Windows Installation of Lasagne <https://github.com/Lasagne/Lasagne/wiki/From-Zero-to-Lasagne-on-Windows-7-%2864-bit%29>`_ **(expert)**
After fetching the repository, you can run the following commands from your terminal to set up a local environment:
.. code:: bash

    # Create a local environment for Python 3.x to install dependencies here.
    python3 -m venv pyvenv --system-site-packages

    # If you're using bash, make this the active version of Python.
    source pyvenv/bin/activate

    # Install the required dependencies using the pip module.
    python3 -m pip install --ignore-installed -r requirements.txt

After this, you should have ``pillow``, ``theano`` and ``lasagne`` installed in your virtual environment. You'll also need to download this `pre-trained neural network <https://github.com/alexjc/neural-doodle/releases/download/v0.0/vgg19_conv.pkl.bz2>`_ (VGG19, 80 MB) and put it in the same folder as the script to run. To uninstall everything, you can just delete the ``pyvenv/`` folder.
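
To confirm the setup before running the script, a small check like the one below may be useful. It is only a sketch: the function name ``missing_requirements`` is hypothetical, ``PIL`` is the import name of the ``pillow`` package, and the file name is taken from the download link above.

.. code:: python

    import importlib.util
    import os


    def missing_requirements(folder=".", modules=("PIL", "theano", "lasagne"),
                             model="vgg19_conv.pkl.bz2"):
        """Return a list of problems found; an empty list means every check passed."""
        problems = []
        for name in modules:
            # find_spec returns None when a top-level module cannot be located.
            if importlib.util.find_spec(name) is None:
                problems.append("module not importable: " + name)
        if not os.path.isfile(os.path.join(folder, model)):
            problems.append("file not found: " + model)
        return problems

Run it with the virtual environment active and ``folder`` pointing at the directory that contains ``enhance.py``; each returned string names one missing piece.
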
.. image:: docs/Faces_example.png
3. Background & Research
========================
This code uses a combination of techniques from the following papers, as well as some minor improvements yet to be documented:
1. `Perceptual Losses for Real-Time Style Transfer and Super-Resolution <http://arxiv.org/abs/1603.08155>`_
2. `Real-Time Super-Resolution Using Efficient Sub-Pixel Convolution <https://arxiv.org/abs/1609.05158>`_
3. `Deeply-Recursive Convolutional Network for Image Super-Resolution <https://arxiv.org/abs/1511.04491>`_
4. `Photo-Realistic Super-Resolution Using a Generative Adversarial Network <https://arxiv.org/abs/1609.04802>`_
4. Troubleshooting Problems
===========================
Can't install or Unable to find pgen, not compiling formal grammar.
-------------------------------------------------------------------
There's a Python extension compiler called Cython, and it's missing or improperly installed. Try getting it directly from the system package manager rather than PIP.
**FIX:** ``sudo apt-get install cython3``
NotImplementedError: AbstractConv2d theano optimization failed.
---------------------------------------------------------------
This happens when you're running without a GPU, and the CPU libraries were not found (e.g. ``libblas``). The neural network expressions cannot be evaluated by Theano and it's raising an exception.
**FIX:** ``sudo apt-get install libblas-dev libopenblas-dev``
TypeError: max_pool_2d() got an unexpected keyword argument 'mode'
------------------------------------------------------------------
You need to install Lasagne and Theano from the versions specified in ``requirements.txt``, rather than the default versions that PIP would install. Those default versions are older and don't have the required features.
**FIX:** ``python3 -m pip install -r requirements.txt``
ValueError: unknown locale: UTF-8
---------------------------------
It seems your terminal is misconfigured and not compatible with the way Python treats locales. You may need to change this in your ``.bashrc`` or other startup script. Alternatively, this command will fix it once for this shell instance.
**FIX:** ``export LC_ALL=en_US.UTF-8``
.. image:: docs/OldStation_example.gif
**Example #3** — Old Station: `view comparison <http://5.9.70.47:4141/w/0f5177f4-9ce6-11e6-992c-c86000be451f/view>`_ in 24-bit HD, `original photo <https://flic.kr/p/oYhbBv>`_ CC-BY-SA @siv-athens.
----
|Python Version| |License Type| |Project Stars|
.. |Python Version| image:: http://aigamedev.github.io/scikit-neuralnetwork/badge_python.svg
    :target: https://www.python.org/

.. |License Type| image:: https://img.shields.io/badge/license-AGPL-blue.svg
    :target: https://github.com/alexjc/neural-enhance/blob/master/LICENSE

.. |Project Stars| image:: https://img.shields.io/github/stars/alexjc/neural-enhance.svg?style=flat
    :target: https://github.com/alexjc/neural-enhance/stargazers