쿠...sal

[Computer] Installing and running bcc on WSL2

Installing and running bcc on WSL2

Procedure

Build the WSL2 kernel
Install and run bcc

1. Build the WSL2 kernel

Download the WSL2 Linux kernel source and build it inside WSL2 Ubuntu.

Get the kernel source from the following link.
- Release linux-msft-wsl-5.10.74.3 · microsoft/WSL2-Linux-Kernel · GitHub
- The eBPF-related flags have been turned on since 5.10.74.3.
Config settings for the build
- To run bcc, the following flags must be set before the kernel is built.
- https://github.com/iovisor/bcc/blob/master/INSTALL.md#kernel-configuration
- Copy Microsoft/config-wsl into <proj_root> and set the flags there.
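Before starting a 27-minute build it can help to confirm that the config file really has the flags turned on. The following is a minimal sketch, not part of the original procedure. The flag list in `REQUIRED_FLAGS` is an assumption taken from the Kernel Configuration section of bcc's INSTALL.md linked above, so check that page for the current list. The helper names are hypothetical.

```python
import re

# Assumption: flag names taken from bcc's INSTALL.md "Kernel Configuration" section.
# Verify against the linked page; the list may change between bcc versions.
REQUIRED_FLAGS = [
    "CONFIG_BPF",
    "CONFIG_BPF_SYSCALL",
    "CONFIG_BPF_JIT",
    "CONFIG_HAVE_EBPF_JIT",
    "CONFIG_BPF_EVENTS",
    "CONFIG_IKHEADERS",
]


def parse_kconfig(text):
    """Return {flag: value} for every 'CONFIG_X=value' line.

    Lines such as '# CONFIG_X is not set' are comments and are skipped.
    """
    flags = {}
    for line in text.splitlines():
        match = re.match(r"^(CONFIG_\w+)=(.*)$", line.strip())
        if match:
            flags[match.group(1)] = match.group(2)
    return flags


def missing_flags(text, required=REQUIRED_FLAGS):
    """Return the required flags that are neither built in (y) nor a module (m)."""
    flags = parse_kconfig(text)
    return sorted(name for name in required if flags.get(name) not in ("y", "m"))


# Usage (hypothetical path):
#   with open("Microsoft/config-wsl") as f:
#       print(missing_flags(f.read()))
```

An empty list means every flag in the assumed list is enabled in the file that was checked.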

build

cpu : i5-9400F @ 2.9GHz
Built inside WSL2, took roughly 27 minutes

<proj_root>\vmlinux : the kernel image produced by the build

sudo apt install build-essential flex bison dwarves libssl-dev libelf-dev
tar xvf WSL2-Linux-Kernel-linux-msft-wsl-5.10.74.3.tar.gz
cd WSL2-Linux-Kernel-linux-msft-wsl-5.10.74.3

# To change the kernel release name, do the following. This is optional.
# export KERNELRELEASE=5.10.74.3-microsoft-standard
# make KCONFIG_CONFIG=Microsoft/config-wsl KERNELRELEASE=$KERNELRELEASE -j 4

cp ./Microsoft/config-wsl .config
# The config path can also be passed explicitly like this.
# make KCONFIG_CONFIG=Microsoft/config-wsl -j 4
make -j 4
make modules -j 4
sudo make modules_install

ls /lib/modules/ : confirms that /lib/modules has been created.

Configure WSL2 to use the newly built kernel

Copy vmlinux to Windows

mkdir /mnt/c/Users/myuserid/kernel
cp vmlinux /mnt/c/Users/myuserid/kernel/vmlinux-5.10.74.3

Set up c:\Users\myuserid\.wslconfig

[wsl2]

kernel=c:\\Users\\myuserid\\kernel\\vmlinux-5.10.74.3
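The kernel path in .wslconfig uses doubled backslashes, which is easy to get wrong by hand. The sketch below converts the WSL-side path used in the cp command into that escaped form. It assumes the default automount layout `/mnt/<drive>/...`, and the function name is hypothetical.

```python
import re


def wsl_to_wslconfig_path(wsl_path):
    """Convert '/mnt/c/Users/x' into 'c:\\\\Users\\\\x' as written in .wslconfig.

    Assumption: Windows drives are mounted under /mnt/<drive letter>.
    """
    match = re.match(r"^/mnt/([a-zA-Z])/(.*)$", wsl_path)
    if match is None:
        raise ValueError("not a /mnt/<drive>/ path: " + wsl_path)
    drive, rest = match.groups()
    parts = [drive + ":"] + [p for p in rest.split("/") if p]
    # Each separator becomes two backslash characters in the file.
    return "\\\\".join(parts)


print("kernel=" + wsl_to_wslconfig_path("/mnt/c/Users/myuserid/kernel/vmlinux-5.10.74.3"))
```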

Restart WSL2

wsl --shutdown
wsl -d Ubuntu-20.04
...
# check the kernel version
uname -r
5.10.74.3-microsoft-standard-WSL2

2. Install and run bcc

Ubuntu - Binary | iovisor/bcc · GitHub

sudo apt-get update
sudo apt-get install bpfcc-tools
sudo opensnoop-bpfcc

With opensnoop-bpfcc running, run curl once in another terminal and you can see how the tool works.

opensnoop prints one line of output for each `open()` syscall. By watching which files get opened, you can work out which data files, config files, and log files an application uses. (See: 1.2. opensnoop | bcc/docs/tutorial.md)
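To turn that stream of lines into a per-command list of files, the output can be post-processed. This is a rough sketch that assumes the default column layout `PID COMM FD ERR PATH` and a command name without spaces. The function names are hypothetical, and the sample lines in the usage note are made up for illustration.

```python
def parse_opensnoop_line(line):
    """Parse one default-format opensnoop line into a dict, or None for headers."""
    parts = line.split(None, 4)
    if len(parts) < 5 or not parts[0].isdigit():
        return None  # header line or something unexpected
    pid, comm, fd, err, path = parts
    return {
        "pid": int(pid),
        "comm": comm,
        "fd": int(fd),
        "err": int(err),
        "path": path.rstrip(),
    }


def opened_paths_by_command(lines):
    """Group successfully opened paths (ERR == 0) by command name."""
    result = {}
    for line in lines:
        event = parse_opensnoop_line(line)
        if event is None or event["err"] != 0:
            continue
        paths = result.setdefault(event["comm"], [])
        if event["path"] not in paths:
            paths.append(event["path"])
    return result
```

For example, feeding it a header line plus `2001   curl                3   0 /etc/ssl/openssl.cnf` yields `{"curl": ["/etc/ssl/openssl.cnf"]}`.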

Reference

[Computer] Changing the default user in WSL2

Changing the default user in WSL2

Edit /etc/wsl.conf
- If /etc/wsl.conf does not exist, create it.

[user]
default=username
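If the file already has other sections, the edit can be scripted so they are preserved. The sketch below uses Python's standard `configparser` on the file contents. The function name is hypothetical, and note that `configparser` drops any comments in the file when it writes the result back out.

```python
import configparser
import io


def set_default_user(conf_text, username):
    """Return wsl.conf contents with [user] default set to username."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.optionxform = str  # keep key names exactly as written
    parser.read_string(conf_text)
    if not parser.has_section("user"):
        parser.add_section("user")
    parser.set("user", "default", username)
    out = io.StringIO()
    # Write 'key=value' without spaces, matching the format shown above.
    parser.write(out, space_around_delimiters=False)
    return out.getvalue()


print(set_default_user("", "username"))
```

Writing the returned text to /etc/wsl.conf requires root, and WSL needs to be restarted (wsl --shutdown) before the change is picked up.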

To run WSL as a specific user: wsl -d <distribution_name> -u root
ubuntu config --default-user johndoe
- Change the default user for a distribution | Microsoft Learn
- This method applies only when the distribution was installed through the Microsoft Store.

Reference

linux - How to set default user for manually installed WSL distro? - Super User

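[Computer] eBPF summary

eBPF summary

The name eBPF appears to come from prefixing Berkeley Packet Filter with e (extended). According to the wiki, BPF was mostly used for analyzing network traffic.

That has since been extended into a technology for running programs that need to execute in kernel mode.

Previously, adding functionality to the kernel meant either modifying the code and rebuilding, or loading kernel modules.[ref. 1]

Safety is provided by static code analysis, which rejects programs that could crash, hang, or otherwise negatively affect the kernel.

Examples of programs that are rejected automatically

Programs without a strong termination guarantee (i.e. for/while loops without an exit condition)
Programs that dereference pointers without safety checks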
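The termination rule can be illustrated with a toy checker. This is only a sketch of the idea and not how the kernel verifier is implemented: it uses a made-up instruction format of `(op, arg)` tuples and accepts a program only when every jump goes forward, which is enough to guarantee it ends. The real verifier checks far more, and newer kernels can also accept loops whose bound they can prove.

```python
def verify(program):
    """Toy check: accept only programs that must terminate.

    program is a list of (op, arg) tuples in a made-up instruction set.
    For 'jmp' and 'jeq', arg is a relative offset from the next instruction.
    """
    if not program or program[-1][0] != "exit":
        return False  # execution could fall off the end
    for index, (op, arg) in enumerate(program):
        if op in ("jmp", "jeq"):
            target = index + 1 + arg
            # A backward jump could form a loop; an out-of-range jump is invalid.
            if arg < 0 or target >= len(program):
                return False
    return True


straight = [("mov", 0), ("jeq", 1), ("mov", 1), ("exit", 0)]
looping = [("mov", 0), ("jmp", -2), ("exit", 0)]
print(verify(straight), verify(looping))
```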

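A loaded program that passes the verifier is either interpreted or JIT compiled in the kernel for fast execution. The programs are then attached to various hook points in the OS kernel and run when the corresponding events occur.

The rough idea

It seems that hooks are placed so that an eBPF function is called whenever a function on a system call path is invoked. If no handler is defined the call simply passes through, and if one is defined it gets executed.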
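That mental model can be written down as a small user-space toy. The class and hook names below are hypothetical and nothing here touches the kernel. It only mirrors the idea that firing a hook with no attached handler is a no-op, while attached handlers run on every event.

```python
from collections import defaultdict


class HookPoints:
    """Toy model of hook points with optional handlers."""

    def __init__(self):
        self._handlers = defaultdict(list)

    def attach(self, hook, handler):
        """Register handler to run every time hook fires."""
        self._handlers[hook].append(handler)

    def fire(self, hook, ctx):
        """Run every handler attached to hook; return how many ran."""
        handlers = self._handlers.get(hook, [])
        for handler in handlers:
            handler(ctx)
        return len(handlers)


hooks = HookPoints()
counts = {}
hooks.attach("sys_clone", lambda ctx: counts.update({ctx["uid"]: counts.get(ctx["uid"], 0) + 1}))

hooks.fire("sys_open", {"uid": 1000})   # no handler attached: passes through
hooks.fire("sys_clone", {"uid": 1000})  # handler runs
hooks.fire("sys_clone", {"uid": 1000})
print(counts)
```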

eBPF on Windows

GitHub - microsoft/ebpf-for-windows: eBPF implementation that runs on top of Windows

eBPF is being built for Windows as well. It adds a layer on top of the existing eBPF so that it also runs on Windows.

Supported versions:

Windows 10 or later
Windows Server 2019 or later

wsl

쿠...sal: [Computer] Installing and running bcc on WSL2

helloworld for wsl2

https://gist.github.com/MarioHewardt/5759641727aae880b29c8f715ba4d30f?permalink_comment_id=4142547#gistcomment-4142547

#!/usr/bin/python3
from bcc import BPF
from time import sleep

program = """
BPF_HASH(clones);

int hello_world(void *ctx) {
    u64 uid;
    u64 counter = 0;
    u64 *p;

    uid = bpf_get_current_uid_gid() & 0xFFFFFFFF;
    p = clones.lookup(&uid);
    if (p != 0){
        counter = *p;
    }

    counter++;
    clones.update(&uid, &counter);

    return 0;
}
"""

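b = BPF(text=program)
# Resolve the kernel-specific symbol name of the clone syscall
clone = b.get_syscall_fnname("clone")
b.attach_kprobe(event=clone, fn_name="hello_world")

# Poll the map every 2 seconds. b.trace_print() is not used here:
# it blocks forever, and this program never calls bpf_trace_printk().
while True:
    sleep(2)
    items = b["clones"].items()
    if items:
        # One "ID <uid>: <count>" entry per user id seen so far
        print("\t".join("ID {}: {}".format(k.value, v.value) for k, v in items))
    else:
        print("No entries yet")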
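The `& 0xFFFFFFFF` mask in the BPF program works because `bpf_get_current_uid_gid()` packs two 32-bit values into one 64-bit integer: the uid in the lower half and the gid in the upper half. A small sketch of the same arithmetic in plain Python follows. The function name is hypothetical.

```python
def split_uid_gid(value):
    """Split a packed 64-bit uid/gid value into (uid, gid)."""
    uid = value & 0xFFFFFFFF  # lower 32 bits
    gid = value >> 32         # upper 32 bits
    return uid, gid


packed = (1000 << 32) | 1001  # gid 1000, uid 1001
print(split_uid_gid(packed))
```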

Reference

[Computer] PowerShell script to kill Adobe Creative Cloud

ps example / powershell kill process / example / kill tree ps

PowerShell script to kill Adobe Creative Cloud

How to stop Adobe Desktop Service.exe and Node.exe… - Adobe Support Community - 10435164

powershell script

function Kill-Tree {
    Param([int]$ppid)
    Get-CimInstance Win32_Process | Where-Object { $_.ParentProcessId -eq $ppid } | ForEach-Object { Kill-Tree $_.ProcessId }
    Stop-Process -Id $ppid
}

function Kill-Process {
    # Kill every process with the given name, including its child processes
    Param([String]$pname)
    Write-Output "kill $pname"
    # -ErrorAction SilentlyContinue: skip quietly when no such process is running
    Get-Process -Name $pname -ErrorAction SilentlyContinue | ForEach-Object {
        Kill-Tree $_.Id
    }
}


Kill-Process "Adobe Desktop Service"
Kill-Process "AdobeIPCBroker"
Kill-Process "Creative Cloud Helper"
Kill-Process "CCXProcess"
Kill-Process "CCLibrary"
Kill-Process "CoreSync"
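Kill-Tree recurses into the children before stopping the parent, so processes are stopped leaves first. The sketch below reproduces only that ordering on a made-up process table of `(pid, parent_pid)` pairs and kills nothing. The function name and the sample pids are hypothetical.

```python
def kill_order(processes, root_pid):
    """Return pids in the order Kill-Tree would stop them (children first).

    processes is an iterable of (pid, parent_pid) pairs.
    """
    children = {}
    for pid, ppid in processes:
        children.setdefault(ppid, []).append(pid)

    order = []
    visited = set()

    def visit(pid):
        if pid in visited:  # guard against a pid listed as its own ancestor
            return
        visited.add(pid)
        for child in children.get(pid, []):
            visit(child)
        order.append(pid)  # the parent goes after all of its descendants

    visit(root_pid)
    return order


table = [(10, 1), (11, 10), (12, 10), (13, 11)]
print(kill_order(table, 10))
```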

Reference

Terminate process tree in PowerShell given a process ID - Stack Overflow

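[Computer] storage and retrieval

db index / how an index is stored

storage and retrieval

Work-in-progress notes on the storage and retrieval part of ref. 1…

How index storage is implemented

Appending to a log is the fast way to write. It beats overwriting existing data in place.
The log is written only up to a certain segment size; once the limit is exceeded, writing continues in a new file. This produces multiple segments.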

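compaction and merge:

While the log is being written, older values are discarded and only the most recent value for each key is kept. Call this compaction.
During compaction, a segment can also be combined with other segments (merge).
Once compaction and merge are done, each key appears only once across the merged segments.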
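The two operations can be sketched in a few lines. This is an illustration under simplifying assumptions: a segment is an in-memory list of `(key, value)` pairs in write order rather than a file, and the function names are hypothetical.

```python
def compact(segment):
    """Keep only the most recent value for each key within one segment."""
    latest = {}
    for key, value in segment:
        latest[key] = value  # a later write overwrites an earlier one
    return list(latest.items())


def merge(segments):
    """Merge segments given oldest first; values in newer segments win."""
    merged = {}
    for segment in segments:
        for key, value in segment:
            merged[key] = value
    return list(merged.items())


old = [("a", 1), ("b", 2), ("a", 3)]
new = [("b", 5), ("c", 6)]
print(compact(old))
print(merge([old, new]))
```

Processing the segments oldest first is what makes the newest value survive, so the order of the list passed to `merge` matters.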

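Limits of the hash table:

The hash table for each segment is kept in memory. An on-disk hash table is not used because it is inefficient.
- An on-disk hash table requires a lot of random-access I/O, is expensive to grow when it fills up, and needs fiddly logic for hash collisions.
Hash tables are not suited to range queries. You cannot scan all keys between kitty00000 and kitty99999; you have to look up each key individually in the hash map.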
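A minimal sketch of an append-only log with an in-memory hash index makes both the strength and the limit visible. It assumes an `io.BytesIO` buffer standing in for the segment file and a naive `key,value` line format (keys without commas or newlines). The class name is hypothetical.

```python
import io


class LogWithHashIndex:
    """Append-only log plus an in-memory hash map of key -> byte offset."""

    def __init__(self):
        self._log = io.BytesIO()  # stand-in for a segment file on disk
        self._index = {}          # key -> offset of the latest record

    def put(self, key, value):
        record = "{},{}\n".format(key, value).encode("utf-8")
        offset = self._log.seek(0, io.SEEK_END)  # always append at the end
        self._log.write(record)
        self._index[key] = offset

    def get(self, key):
        offset = self._index.get(key)
        if offset is None:
            return None
        self._log.seek(offset)  # one seek, one read
        line = self._log.readline().decode("utf-8").rstrip("\n")
        return line.split(",", 1)[1]

    def keys_between(self, low, high):
        # The hash map has no ordering, so every key must be examined.
        return sorted(k for k in self._index if low <= k <= high)


db = LogWithHashIndex()
db.put("kitty00001", "a")
db.put("kitty00002", "b")
db.put("kitty00001", "c")  # the old record stays in the log until compaction
print(db.get("kitty00001"), db.keys_between("kitty00000", "kitty99999"))
```

A point lookup touches one offset, while `keys_between` has to walk the whole index, which is the range-query weakness described above.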

SSTable:

Short for sorted string table.
Each segment stores its entries sorted by key. (This naturally rules out plain sequential appends.) Each segment is then compacted (so every key is unique within it). The segments are then merged while preserving the sort order (as in merge sort).
With this, a hash map covering every key is no longer needed; a sparse index holding only some of the keys is enough. Jump to the position of a nearby key and scan from there.
The chunk of keys that each index entry points to can also be stored compressed. Compression reduces I/O bandwidth use and saves disk space.
Structures that keep keys sorted as they are inserted include AVL trees and red-black trees.
With something like a B-tree this sorted structure can also be maintained on disk, but maintaining it in memory is easier.
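The sparse-index lookup can be sketched with `bisect`. This is an illustration under assumptions: the sorted segment is an in-memory list, list positions stand in for byte offsets in a file, and the class name and `every` parameter are hypothetical.

```python
import bisect


class SparseIndexedSegment:
    """Sorted segment with an index entry for only every Nth key."""

    def __init__(self, sorted_pairs, every=3):
        self._pairs = list(sorted_pairs)
        self._index_pos = list(range(0, len(self._pairs), every))
        self._index_keys = [self._pairs[pos][0] for pos in self._index_pos]

    def get(self, key):
        # Find the last indexed key that is <= the wanted key.
        i = bisect.bisect_right(self._index_keys, key) - 1
        if i < 0:
            return None  # smaller than every key in the segment
        start = self._index_pos[i]
        end = self._index_pos[i + 1] if i + 1 < len(self._index_pos) else len(self._pairs)
        # Scan only the chunk between two neighbouring index entries.
        for k, v in self._pairs[start:end]:
            if k == key:
                return v
            if k > key:
                break  # sorted order: the key cannot appear later
        return None


segment = SparseIndexedSegment(
    [("apple", 1), ("banana", 2), ("cherry", 3), ("grape", 4),
     ("kiwi", 5), ("lemon", 6), ("mango", 7)]
)
print(segment.get("kiwi"), segment.get("fig"))
```

Only three of the seven keys are indexed here, yet every key can still be found by scanning at most one chunk.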

The storage engine now works like this.

A balanced tree is kept in memory.
When a write comes in, it is added to the balanced tree.
Call this in-memory tree the memtable.
When the memtable grows to roughly a few MB, the tree so far is written out to a file.
That file is written as an SSTable (Sorted String Table; think of it as a chunk of data sorted by key, i.e. the tree's leaf nodes stored in order).
This is efficient because the tree already keeps the key-value pairs sorted by key.
The newly created SSTable file becomes the most recent db segment.
If writes come in while the earlier keys are being written to this SSTable file, they go into a new memtable.

As a result, each SSTable holds the keys accumulated over a particular period of time.

When handling a read

First look for the key in the memtable.
If it is not there, go to the most recent disk segment. If it is not there either, look in the next-older segment, and so on.
From time to time, a merge and compaction process runs in the background. This reduces the number of SSTables.
It merges the segment files and discards overwritten or deleted values.

To guard against losing the memtable, every write is also appended to a plain sequential log, which covers this problem to a fair degree. That log can be discarded once the SSTable has been written.
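The write path, read path, and background compaction described above fit into one small sketch. It is a toy under stated assumptions: a dict that is sorted on flush stands in for the balanced tree, Python lists stand in for SSTable files and the sequential log, the memtable limit counts keys rather than bytes, and deletes are not handled. The class name is hypothetical.

```python
class TinyLSM:
    """Toy memtable + SSTable engine kept entirely in memory."""

    def __init__(self, memtable_limit=2):
        self.memtable = {}
        self.log = []       # stand-in for the sequential recovery log
        self.sstables = []  # oldest first; each is a key-sorted list of pairs
        self.memtable_limit = memtable_limit

    def put(self, key, value):
        self.log.append((key, value))  # logged first, for recovery
        self.memtable[key] = value
        if len(self.memtable) >= self.memtable_limit:
            self._flush()

    def _flush(self):
        # Write the memtable out in key order as the newest segment.
        self.sstables.append(sorted(self.memtable.items()))
        self.memtable = {}
        self.log.clear()  # the log is no longer needed once flushed

    def get(self, key):
        if key in self.memtable:
            return self.memtable[key]
        for table in reversed(self.sstables):  # newest segment first
            for k, v in table:
                if k == key:
                    return v
        return None

    def compact(self):
        merged = {}
        for table in self.sstables:  # oldest first, so newer values win
            merged.update(table)
        self.sstables = [sorted(merged.items())]


db = TinyLSM(memtable_limit=2)
for key, value in [("a", 1), ("b", 2), ("a", 3), ("c", 4)]:
    db.put(key, value)
print(db.get("a"), len(db.sstables))
db.compact()
print(db.sstables)
```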

Reference

Designing Data Intensive Applications

[Computer] Processing an mbox file from Gmail with Python

mbox

Processing an mbox file from Gmail with Python

When you back up your data from Gmail, you can download it in .mbox format. To read it, you can simply use an mbox viewer during its demo period.

MBOX Viewer Software Free to Read MBOX Files of Mac & Windows

I also looked for a way to process it myself. Searching for free tools that convert mbox to HTML turned up a few for Linux. One of them is written in C and has to be compiled, so I passed on it for now.

GitHub - hypermail-project/hypermail: Hypermail is a free (GPL) program to convert email from Unix mbox format to html.

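After looking around, I found a simple one written in Python.

Quick python code to parse mbox files, specifically those used by GMail. Extracts sender, date, plain text contents etc., ignores base64 attachments. · GitHub
- Requires pip install bs4 lxml (mailbox is part of the Python standard library, so it does not need to be installed)

Reading the file with this and producing the output in whatever shape you want seems like the best option.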
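As a starting point for that kind of custom output, here is a minimal sketch that uses only the standard library `mailbox` module. It is not the linked gist. The function name and the returned dict layout are hypothetical, and it keeps only the first text/plain part of each message.

```python
import mailbox


def summarize_mbox(path):
    """Return a list of dicts with sender, date, subject and plain-text body."""
    summaries = []
    box = mailbox.mbox(path, create=False)  # do not create the file if missing
    try:
        for message in box:
            body = ""
            for part in message.walk():
                if part.get_content_type() != "text/plain":
                    continue
                payload = part.get_payload(decode=True)  # undoes base64/quoted-printable
                if payload is None:
                    continue
                charset = part.get_content_charset() or "utf-8"
                try:
                    body = payload.decode(charset, errors="replace")
                except LookupError:  # unknown charset name in the mail
                    body = payload.decode("utf-8", errors="replace")
                break
            summaries.append({
                "from": message["From"],
                "date": message["Date"],
                "subject": message["Subject"],
                "body": body,
            })
    finally:
        box.close()
    return summaries


# Usage (hypothetical file name):
#   for mail in summarize_mbox("All mail Including Spam and Trash.mbox"):
#       print(mail["date"], mail["from"], mail["subject"])
```

From the returned list it is straightforward to write HTML, CSV, or whatever output shape is needed.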

[Computer] Using jOOQ with the jOOQ Gradle plugin

what is jooq /

Using jOOQ with the jOOQ Gradle plugin

This post uses jOOQ through the jOOQ Gradle plugin.

The jOOQ team also recommends using the Gradle plugin.

The usage described here is based on the README.md of gradle-jooq-plugin.

etiennestuder/gradle-jooq-plugin: Gradle plugin that integrates jOOQ.

The simplest example is the following.

gradle-jooq-plugin/example/configure_generation_tool_execution/build.gradle at main · etiennestuder/gradle-jooq-plugin · GitHub

`generationTool`

setup for generationTool

Here we apply it to a Spring Boot project connected to MariaDB and use generationTool to generate the classes for the MariaDB tables and records.

First, prepare a Spring Boot project by following this link.
- 쿠…sal: [Computer] Helloworld for a SpringBoot web server

jooq.gradle : add a separate *.gradle file like the one below. Everything can also be put directly into build.gradle; see here for that.

build.gradle

...
apply from: 'jooq.gradle'
...

jooq.gradle:
- If database.inputSchema is not specified, classes are generated for every database.
- database.include, database.exclude : include and exclude conditions for code generation. For example, code can be generated only for tables with particular names. See Includes and Excludes for details.

import nu.studer.gradle.jooq.JooqEdition

buildscript {
  repositories {
    gradlePluginPortal()
  }

  dependencies {
    classpath 'nu.studer:gradle-jooq-plugin:8.2'
  }
}


apply plugin: nu.studer.gradle.jooq.JooqPlugin

repositories {
  mavenCentral()
}

dependencies {
  // If you are not on MariaDB, use the JDBC client for your database instead.
  jooqGenerator 'org.mariadb.jdbc:mariadb-java-client'
}

jooq {
  configurations {
    main {
      generationTool {
        logging = org.jooq.meta.jaxb.Logging.WARN
        jdbc {
          url = 'jdbc:mariadb://localhost:3307/stest?useUnicode=true&characterEncoding=utf-8'
          user = 'root'
          password = 'root'
          driver = 'org.mariadb.jdbc.Driver'
          properties {
            property {
              // This is simply a place to set whatever property you want.
              key = 'PAGE_SIZE'
              value = 2048
            }
          }
        }
        generator {
          name = 'org.jooq.codegen.DefaultGenerator'
          database {
            name ='org.jooq.meta.mariadb.MariaDBDatabase'
            inputSchema = 'stest'   // when set, classes are generated only for this database schema
            include = '.*'
            exclude = 'flyway_schema_history'
          }
          target {
            packageName = 'com.namh.namhex'
          }
          strategy {
            name = "org.jooq.codegen.DefaultGeneratorStrategy"
          }
        }
      }
    }
  }
}
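The include / exclude pair in the configuration above can be read as "generate for every name matching include, unless it also matches exclude". The sketch below only illustrates that selection rule in Python with made-up table names. jOOQ itself evaluates these settings as Java regular expressions, so details of the real matching may differ, and the function name is hypothetical.

```python
import re


def selected_tables(table_names, include=".*", exclude=""):
    """Return the names that match include and do not match exclude."""
    include_re = re.compile(include)
    exclude_re = re.compile(exclude) if exclude else None
    selected = []
    for name in table_names:
        if not include_re.fullmatch(name):
            continue
        if exclude_re is not None and exclude_re.fullmatch(name):
            continue
        selected.append(name)
    return selected


# Hypothetical table names, using the same patterns as the configuration above.
tables = ["users", "orders", "flyway_schema_history"]
print(selected_tables(tables, include=".*", exclude="flyway_schema_history"))
```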

Running gradlew.bat generateJooq generates the class files.
- In the case above packageName is com.namh.namhex, so the class files for the DB tables and records are created under the following path.
  - <root>\build\generated-src\jooq\main\com\namh\namhex

d:\namhex000>gradlew generateJooq

> Task :generateJooq
SLF4J: No SLF4J providers were found.
SLF4J: Defaulting to no-operation (NOP) logger implementation
SLF4J: See https://www.slf4j.org/codes.html#noProviders for further details.

Deprecated Gradle features were used in this build, making it incompatible with Gradle 9.0.

You can use '--warning-mode all' to show the individual deprecation warnings and determine if they come from your own scripts or plugins.

For more on this, please refer to https://docs.gradle.org/8.2.1/userguide/command_line_interface.html#sec:command_line_warnings in the Gradle documentation.

BUILD SUCCESSFUL in 5s
1 actionable task: 1 executed

jooq gradle-plugin examples

gradle-jooq-plugin/example at main · etiennestuder/gradle-jooq-plugin · GitHub

Using jOOQ with the jOOQ Gradle plugin

Instead of connecting to the DB to read the table schema, the classes can also be generated from a schema kept locally.

jooq slow query

Reference

Getting Started with jOOQ | Baeldung
