Author: wangliwen1994

38 Posts

[Reading Notes] Arbitrary Style Transfer in Real-time with Adaptive Instance Normalization
Source Paper: [ICCV 2017] https://arxiv.org/abs/1703.06868
Authors: Xun Huang, Serge Belongie
Code: https://github.com/xunhuang1995/AdaIN-style

Contributions
In this paper, the authors present a simple yet effective approach that for the first time enables arbitrary style transfer in real time. Arbitrary style transfer takes a content image $C$ and an arbitrary style image $S$ as inputs, and synthesizes an output image with the same content as $C$ and the same style as $S$.

Background
Batch Normalization
Given an input batch $x \in \mathbb{R}^{N \times C \times H \times W}$, batch normalization (BN) normalizes the mean and standard deviation of each individual feature channel:
$$\mathrm{BN}(x)=\gamma\left(\frac{x-\mu(x)}{\sigma(x)}\right)+\beta$$
where $\gamma, \beta \in \mathbb{R}^{C}$ are affine parameters learned from data, and $\mu(x), \sigma(x) \in \mathbb{R}^{C}$ are the mean and standard deviation computed across the batch and spatial dimensions, independently for each channel:
$$\mu_{c}(x)=\frac{1}{NHW} \sum_{n=1}^{N} \sum_{h=1}^{H} \sum_{w=1}^{W} x_{nchw}$$
$$\sigma_{c}(x)=\sqrt{\frac{1}{NHW} \sum_{n=1}^{N} \sum_{h=1}^{H} \sum_{w=1}^{W}\left(x_{nchw}-\mu_{c}(x)\right)^{2}+\epsilon}$$

Instance Normalization
The original feed-forward stylization method [51] uses BN layers after the convolutional layers. Ulyanov et al. [52] found that using instance normalization…
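To make the channel-wise statistics concrete, here is a minimal NumPy sketch of instance normalization statistics and the AdaIN operation the paper builds on them; the function names, shapes, and `eps` value are illustrative assumptions, not the paper's code:

```python
import numpy as np

def instance_stats(x, eps=1e-5):
    """Per-sample, per-channel mean/std over the spatial dims.

    x: feature map of shape (N, C, H, W).
    """
    mu = x.mean(axis=(2, 3), keepdims=True)  # (N, C, 1, 1)
    sigma = np.sqrt(x.var(axis=(2, 3), keepdims=True) + eps)
    return mu, sigma

def adain(content, style):
    """Align content features to the style features' channel statistics:
    AdaIN(x, y) = sigma(y) * (x - mu(x)) / sigma(x) + mu(y)."""
    mu_c, sigma_c = instance_stats(content)
    mu_s, sigma_s = instance_stats(style)
    return sigma_s * (content - mu_c) / sigma_c + mu_s
```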
[Reading Notes] Collaborative Distillation for Ultra-Resolution Universal Style Transfer
Source
Authors: Huan Wang, Yijun Li, Yuehai Wang, Haoji Hu, Ming-Hsuan Yang
Paper: [CVPR 2020] https://arxiv.org/abs/2003.08436
Code: https://github.com/mingsun-tse/collaborative-distillation

Contributions
It proposes a new knowledge distillation method, "Collaborative Distillation", based on the exclusive collaborative relation between the encoder and its decoder.
It proposes to restrict the student to learning a linear embedding of the teacher's outputs, which boosts its learning (a minimal sketch follows below).
Experimental work is done with different stylization frameworks, such as WCT and AdaIN.

Related Works
Style Transfer
WCT: Li, Y., Fang, C., Yang, J., Wang, Z., Lu, X., & Yang, M. H. (2017). Universal style transfer via feature transforms. arXiv preprint arXiv:1705.08086.
AdaIN: Huang, X., & Belongie, S. (2017). Arbitrary style transfer in real-time with adaptive instance normalization. In Proceedings of the IEEE International Conference on Computer Vision (pp. 1501-1510).

Model Compression
low-rank decomposition
pruning
quantization
knowledge distillation
Knowledge distillation is a promising model compression method that transfers the knowledge of a large network (called the teacher) to a small network (called the student), where the knowledge can be softened probabilities (which reflect the inherent class-similarity structure known as dark knowledge) or sample relations (which…
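As a rough illustration of the linear-embedding constraint mentioned above, here is a minimal PyTorch sketch; the channel sizes and the `embed` module are hypothetical stand-ins, not the authors' implementation:

```python
import torch
import torch.nn as nn

# Hypothetical feature sizes: teacher channels C_t > student channels C_s.
C_t, C_s = 512, 128

# A 1x1 conv acts as a learnable linear map from the student's feature
# space into the teacher's, so the student is restricted to learning a
# linear embedding of the teacher's outputs.
embed = nn.Conv2d(C_s, C_t, kernel_size=1, bias=False)

def distill_loss(student_feat, teacher_feat):
    """MSE between the linearly embedded student features and the
    (detached, fixed) teacher features."""
    return nn.functional.mse_loss(embed(student_feat), teacher_feat.detach())

# Usage with random stand-in features:
s = torch.randn(2, C_s, 32, 32)
t = torch.randn(2, C_t, 32, 32)
loss = distill_loss(s, t)
```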
NVIDIA Xavier Deployment: ONNX Runtime and TensorRT
Installation

Virtual Environment - archiconda
Install the archiconda environment from the terminal:

```bash
wget https://github.com/Archiconda/build-tools/releases/download/0.2.3/Archiconda3-0.2.3-Linux-aarch64.sh
sh Archiconda3-0.2.3-Linux-aarch64.sh
```

Reference: https://blog.csdn.net/qq_40691868/article/details/114362278?spm=1001.2014.3001.5501

[Not Necessary] To enter the system Python environment by default when opening a terminal, comment out the line "conda activate base" in ".bashrc":

```bash
# added by Archiconda3 0.2.3 installer
# >>> conda init >>>
# !! Contents within this block are managed by 'conda init' !!
__conda_setup="$(CONDA_REPORT_ERRORS=false '/home/jetson/archiconda3/bin/conda' shell.bash hook 2> /dev/null)"
if [ $? -eq 0 ]; then
    \eval "$__conda_setup"
else
    if [ -f "/home/jetson/archiconda3/etc/profile.d/conda.sh" ]; then
        . "/home/jetson/archiconda3/etc/profile.d/conda.sh"
        CONDA_CHANGEPS1=false
        #conda activate base
    else
        \export PATH="/home/jetson/archiconda3/bin:$PATH"
    fi
fi
unset __conda_setup
# <<< conda init <<<
```

Create a new environment
It is better to keep the Python version consistent with the system:

```bash
conda create --name mytest python=3.6.9
conda activate mytest
```

Connect the prebuilt packages to the virtual environment (useless, needs further verification)
In the Python interactive shell there is no opencv package, because the conda environment hides the global/user site packages. To reintroduce them, enter the virtual environment through:

```bash
export PYTHONNOUSERSITE=0
conda activate <YOUR_ENVIRONMENT>
```
…
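A quick way to verify whether the user/global site packages are actually visible inside the environment is to inspect the standard `site` module from Python; this is a generic check, not specific to archiconda or Jetson:

```python
import site

# Whether user site-packages are enabled for this interpreter
# (the PYTHONNOUSERSITE environment variable influences this flag).
print("user site enabled:", site.ENABLE_USER_SITE)
print("user site dir:", site.getusersitepackages())

# All site-packages directories currently used by this interpreter
print("site packages:", site.getsitepackages())

# Verify a prebuilt system package, e.g. OpenCV, is importable
try:
    import cv2
    print("cv2 found:", cv2.__version__)
except ImportError:
    print("cv2 NOT found on sys.path")
```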
Cheat Sheet of Ubuntu
TODO
This post contains the solutions to some common bugs that occur in deep learning with Ubuntu. It covers:
Basic usage of Ubuntu
CUDA

Ubuntu
Add/delete user

```bash
# add a new user NEW_USER_NAME (adduser creates the home directory by default)
sudo adduser NEW_USER_NAME
# set password
sudo passwd NEW_USER_NAME
# delete a user
sudo deluser NEW_USER_NAME

# Fix a bug: if the console of the new user does
# NOT display the current directory, OR only displays $,
# OR cannot use most of the shell commands,
# it means the "bin" path is not connected correctly.
# Delete the user and add it again through:
useradd -s /bin/bash -d /home/xxx -m xxx
```

Install CUDA
Install CUDA 10.2
Taking CUDA 10.2 as an example, the official instructions are from https://developer.nvidia.com/cuda-10.2-download-archive?target_os=Linux&target_arch=x86_64&target_distro=Ubuntu&target_version=1804&target_type=runfilelocal

Command:

```bash
# For Ubuntu 20.04, GCC 8 is needed:
# the maximum GCC version supported by CUDA 10.2 is GCC 8.
# Install GCC 8 and switch the system GCC version to GCC 8:
sudo apt-get install gcc-8 g++-8
sudo update-alternatives --install /usr/bin/gcc gcc /usr/bin/gcc-8 800 --slave /usr/bin/g++ g++ /usr/bin/g++-8
# download…
```
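After installation, a quick sanity check from Python is possible, assuming a CUDA-aware framework such as PyTorch is installed (any framework with a CUDA backend works similarly):

```python
import torch

# True if the NVIDIA driver and CUDA runtime are usable from PyTorch
print("CUDA available:", torch.cuda.is_available())
if torch.cuda.is_available():
    print("device:", torch.cuda.get_device_name(0))
    # The CUDA version PyTorch was built against (e.g. "10.2")
    print("built with CUDA:", torch.version.cuda)
```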
Cheat Sheet of OpenCV
TODO in progress

Image Related
TODO

Video Related
Basic video reading

```python
import cv2

video_file = "path/to/video/file.mp4"
cap = cv2.VideoCapture(video_file)
fps = cap.get(cv2.CAP_PROP_FPS)
length = cap.get(cv2.CAP_PROP_FRAME_COUNT)
print("Video File %s ==> fps: %.2f, total: %d frames" % (video_file, fps, length))

video_start = int(0)             # start from index 0
video_end = video_start + 50000  # read the first 50000 frames
cap.set(cv2.CAP_PROP_POS_FRAMES, video_start)
for frame_index in range(video_start, video_end):
    ret, frame = cap.read()
    if not ret:
        break
    cv2.putText(frame, "Frame:" + str(frame_index), (20, 100),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
    cv2.imshow("frame", frame)
    cv2.waitKey(0)  # press any key to step to the next frame
```

Reading with Specific Index
Example: read one frame every 5 seconds.

```python
import cv2

video_file = "path/to/video/file.mp4"
cap = cv2.VideoCapture(video_file)
fps = cap.get(cv2.CAP_PROP_FPS)
length = cap.get(cv2.CAP_PROP_FRAME_COUNT)
print("Video File %s ==> fps: %.2f, total: %d frames" % (video_file, fps, length))

video_start = int(0)             # start from index 0
video_end = video_start + 50000  # read the first 50000 frames
cap.set(cv2.CAP_PROP_POS_FRAMES, video_start)
for frame_index in range(video_start, video_end):
    if not frame_index % int(5 * fps) == 0:
        continue
    # jump directly to the wanted frame index, then read it
    cap.set(cv2.CAP_PROP_POS_FRAMES, frame_index)
    ret, frame = cap.read()
    if not ret:
        break
    cv2.putText(frame, "Frame:" + str(frame_index), (20, 100),
                cv2.FONT_HERSHEY_SIMPLEX, 1, (255, 255, 255), 2)
    cv2.imshow("frame", frame)
    cv2.waitKey(0)
```

Convert a Video File to…
How to Schedule a Meeting via Zoom
Step 1: Open the "Zoom" software from the Desktop/Start menu.
Step 2: Click "Schedule" on the Zoom homepage.
Step 3: Set the basic information of the scheduled meeting.
Step 4: Copy the invitation link from the Zoom homepage.
Step 5: Paste the link and send it to others. An example:
Scaled-YOLOv4: Scaling Cross Stage Partial Network
Scaled-YOLOv4: Scaling Cross Stage Partial Network

In these reading notes:
We review some basic model scaling methods: width, depth, resolution, and compound scaling.
We compute the amount of operations of residual blocks, showing how it relates to the input image size (quadratic), the number of layers (linear), and the number of filters (quadratic); a small sketch of this computation follows below.
We present the proposed Cross-Stage Partial (CSP) method, which decreases the operations and improves the performance of basic CNN layers.

The PPT can be downloaded from: https://connectpolyu-my.sharepoint.com/:p:/g/personal/18048204r_connect_polyu_hk/ET9zlHku9TFApqdl1A5NTV8BjFXPLizhCMupm6Ohcbehig?e=hhLlyc
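As a back-of-the-envelope illustration of those scaling relations (not taken from the paper's code), the multiply-accumulate count of a plain stack of 3x3 convolutions can be estimated as follows; `resolution`, `depth`, `width`, and `k` are hypothetical parameters:

```python
def conv_stack_macs(resolution, depth, width, k=3):
    """Rough MACs of `depth` back-to-back k x k conv layers with `width`
    input and output channels on a (resolution x resolution) feature map.

    MACs per layer = H * W * k^2 * C_in * C_out, so the total is
    quadratic in resolution, linear in depth, quadratic in width.
    """
    per_layer = resolution * resolution * k * k * width * width
    return depth * per_layer

base = conv_stack_macs(resolution=64, depth=4, width=128)
# doubling resolution -> ~4x, doubling depth -> 2x, doubling width -> ~4x
print(conv_stack_macs(128, 4, 128) / base)  # 4.0
print(conv_stack_macs(64, 8, 128) / base)   # 2.0
print(conv_stack_macs(64, 4, 256) / base)   # 4.0
```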
YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
Paper Information
Paper: YOLObile: Real-Time Object Detection on Mobile Devices via Compression-Compilation Co-Design
Authors: Yuxuan Cai, Hongjia Li, Geng Yuan, Wei Niu, Yanyu Li, Xulong Tang, Bin Ren, Yanzhi Wang
Paper: https://arxiv.org/abs/2009.05697
Github: https://github.com/nightsnack/YOLObile
Objective: real-time object detection for mobile devices.

Study notes and presentation:
Download: https://connectpolyu-my.sharepoint.com/:p:/g/personal/18048204r_connect_polyu_hk/EcRbix5iqshBglmxuLurS-sBBFmbrk8chRkim1y54-yOXw?e=8Qdfmd
DANet: Dual Attention Network for Scene Segmentation
Abstract
The paper introduces a position attention module and a channel attention module to capture global dependencies in the spatial and channel dimensions, respectively. The proposed DANet adaptively integrates local semantic features using the self-attention mechanism.

Outline
Brief review: attention mechanism, SENet
DANet: Dual Attention Network
Experiments: visualization and comparison
Conclusion

Download: https://connectpolyu-my.sharepoint.com/:p:/g/personal/18048204r_connect_polyu_hk/EbgphNjvYP5Psw5gdgDjInQBs761z4x8FYboKXF2arT6kw?e=haTOHI
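To make the abstract concrete, here is a minimal PyTorch sketch of a position-attention-style module of the kind the paper describes; the class name, layer sizes, and variable names are illustrative assumptions, not the authors' code:

```python
import torch
import torch.nn as nn

class PositionAttention(nn.Module):
    """Self-attention over spatial positions: every position aggregates
    features from all other positions, weighted by pairwise similarity."""

    def __init__(self, channels):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x):
        n, c, h, w = x.shape
        q = self.query(x).view(n, -1, h * w).permute(0, 2, 1)  # (N, HW, C')
        k = self.key(x).view(n, -1, h * w)                     # (N, C', HW)
        attn = torch.softmax(torch.bmm(q, k), dim=-1)          # (N, HW, HW)
        v = self.value(x).view(n, c, h * w)                    # (N, C, HW)
        out = torch.bmm(v, attn.permute(0, 2, 1)).view(n, c, h, w)
        return self.gamma * out + x  # residual connection

# Usage with a random feature map:
pam = PositionAttention(channels=64)
y = pam(torch.randn(1, 64, 16, 16))
```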