쿠...sal

[Computing] Difference between streams and readers/writers in Java

java io / understanding I/O /

Difference between streams and readers/writers in Java

Java has two families of I/O classes:

stream
readers/writers

stream

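Streams are the broader category. A stream can handle any kind of I/O: wherever binary data is read or written, a stream is involved. For example, for reading and writing data over the network you can imagine something like a "NetworkReadStream". That is not a real class; in practice you obtain the stream through Socket.getInputStream() and similar methods. Streams therefore apply to any device that sends or receives data: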

network
file
other devices that exchange data

readers/writers

Readers/writers are used to read and write text. They add one more layer on top of a stream: they read the binary data and hand it back converted to characters.

buffered I/O

However, reading and writing one byte at a time like this is slow, so a buffer is kept in memory.

Ref. 1 shows a good example:

BufferedReader br = new BufferedReader(new InputStreamReader(System.in));

The line above wraps a Reader around the InputStream System.in, so the bytes read from System.in are converted to text. That InputStreamReader is then wrapped in a BufferedReader, which gives a reader that reads ahead into a buffer.

Reference

[Computing] Notes on Python I/O performance

python bulk write / text write to file / file write / fast file io /

Notes on Python I/O performance

Text I/O

The simplest way to create a text stream is the open() function:

f = open("mytext", "r")

In-memory text streams are available as StringIO objects:

f = io.StringIO("initial text")

Binary I/O

Binary I/O is also called buffered I/O.

f = open("myfile.jpg", "rb")

in-memory binary stream

f = io.BytesIO(b"some initial binary data: \x00\x01")
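
As a minimal sketch of the difference between the two in-memory stream types above: a text stream works with str, a binary stream works with bytes, and the two cannot be mixed. The variable names are illustrative only.

```python
import io

# Text stream: accepts and returns str.
text_stream = io.StringIO("initial text")
text_value = text_stream.read()

# Binary stream: accepts and returns bytes.
binary_stream = io.BytesIO(b"some initial binary data: \x00\x01")
binary_value = binary_stream.read()

print(type(text_value).__name__, type(binary_value).__name__)

# Writing bytes to a text stream is rejected with a TypeError.
try:
    text_stream.write(b"raw bytes")
    mixing_error = None
except TypeError as exc:
    mixing_error = exc

print(repr(mixing_error))
```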

Performance

Ref. 1 has a good write-up, so I have translated part of it here.

It discusses I/O performance under the following four headings:

Binary I/O
Text I/O
Multi-threading
Reentrancy

Binary I/O

By reading and writing only large chunks of data even when the user asks for a single byte, buffered I/O hides the inefficiency of calling and executing the operating system's unbuffered I/O routines. The gain depends on the OS and on the kind of I/O being performed. For example, on some modern OSes such as Linux, unbuffered disk I/O can be as fast as buffered I/O. The bottom line, however, is that buffered I/O offers predictable performance regardless of the platform and the backing device. For that reason, buffered I/O is almost always preferable to unbuffered I/O for binary data.
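
The choice between the two can be seen directly in open(). The sketch below assumes a throwaway temporary file and illustrative variable names: buffering=0 (allowed only in binary mode) returns the raw FileIO object, while the default returns a BufferedReader wrapped around a FileIO.

```python
import io
import os
import tempfile

# Create a small throwaway file to open in two different ways.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"x" * 1024)
    path = tmp.name

try:
    # buffering=0 disables buffering and hands back the raw FileIO object.
    with open(path, "rb", buffering=0) as raw_file:
        raw_type = type(raw_file)
        first_raw_byte = raw_file.read(1)  # asks the OS for a single byte

    # The binary-mode default is a BufferedReader around a FileIO.
    with open(path, "rb") as buffered_file:
        buffered_type = type(buffered_file)
        inner_type = type(buffered_file.raw)
        first_buffered_byte = buffered_file.read(1)  # served from a larger internal read
finally:
    os.remove(path)

print(raw_type.__name__, buffered_type.__name__, inner_type.__name__)
print("default buffer size:", io.DEFAULT_BUFFER_SIZE)
```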

Text I/O

Text I/O over binary storage (such as a file) is significantly slower than binary I/O over the same storage, because it requires conversion between unicode and binary data using a character codec. This becomes noticeable when handling huge amounts of text data such as large log files. In addition, TextIOWrapper.tell() and TextIOWrapper.seek() are both quite slow because of the reconstruction algorithm they use.

StringIO, however, is a native in-memory unicode container and will show speed similar to BytesIO.
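
The codec step can be made visible with a small sketch. It assumes UTF-8 and wraps an in-memory BytesIO instead of a real file; the names are illustrative. The TextIOWrapper encodes each str on the way down, whereas StringIO stores the str as-is.

```python
import io

raw = io.BytesIO()
# TextIOWrapper is the layer that encodes str into bytes.
text = io.TextIOWrapper(raw, encoding="utf-8", newline="\n")
text.write("h\u00e9llo\n")   # 6 characters, one of them non-ASCII
text.flush()                 # push the encoded bytes into the BytesIO
encoded = raw.getvalue()

# StringIO keeps the str in memory directly, so no codec is involved.
memory_text = io.StringIO()
memory_text.write("h\u00e9llo\n")
stored = memory_text.getvalue()

print(len(stored), "characters ->", len(encoded), "bytes:", encoded)
```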

Multi-threading

FileIO objects are thread-safe to the extent that the operating system calls they wrap (such as read(2) under Unix) are thread-safe. (me: I take this to mean they are exactly as thread-safe as the functions they wrap.)

Binary buffered objects (BufferedReader, BufferedWriter, BufferedRandom, BufferedRWPair) protect their internal structures with a lock, so it is safe to call them from multiple threads at once.

TextIOWrapper objects are not thread-safe.
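
As a rough illustration (not a proof) of the lock on binary buffered objects, the sketch below has several threads write fixed-size records into one shared BufferedWriter and then checks that no bytes were lost and no record was interleaved with another. The record size, thread count, and BytesIO sink are arbitrary choices for the example.

```python
import io
import threading

RECORD_SIZE = 64
RECORDS_PER_THREAD = 200
THREAD_COUNT = 4

sink = io.BytesIO()
# A small buffer forces frequent flushes to the underlying sink.
writer = io.BufferedWriter(sink, buffer_size=256)


def worker(marker: bytes) -> None:
    """Write RECORDS_PER_THREAD records that consist of one repeated byte."""
    record = marker * RECORD_SIZE
    for _ in range(RECORDS_PER_THREAD):
        writer.write(record)


threads = [
    threading.Thread(target=worker, args=(bytes([ord("A") + i]),))
    for i in range(THREAD_COUNT)
]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

writer.flush()
data = sink.getvalue()

# Every 64-byte slice should contain a single repeated byte value.
records = [data[i:i + RECORD_SIZE] for i in range(0, len(data), RECORD_SIZE)]
intact = all(len(set(record)) == 1 for record in records)
print(len(data), intact)
```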

Reentrancy

Binary buffered objects (BufferedReader, BufferedWriter, BufferedRandom, BufferedRWPair) are not reentrant. Reentrant calls do not happen in normal situations, but they can arise from doing I/O inside a signal handler.

If a thread tries to re-enter a buffered object that it is already accessing, a RuntimeError is raised. This does not, however, prevent a different thread from entering the buffered object.

The above implicitly extends to text files as well, because the open() function wraps a buffered object inside a TextIOWrapper.
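
The wrapping can be inspected on any file opened in text mode. This sketch uses a temporary file and illustrative names, and walks down the layers through the buffer and raw attributes.

```python
import io
import os
import tempfile

fd, path = tempfile.mkstemp(suffix=".txt")
os.close(fd)

try:
    with open(path, "w", encoding="utf-8") as text_file:
        layers = [
            type(text_file),             # text layer
            type(text_file.buffer),      # binary buffered layer
            type(text_file.buffer.raw),  # raw, unbuffered layer
        ]
finally:
    os.remove(path)

print(" -> ".join(layer.__name__ for layer in layers))
# TextIOWrapper -> BufferedWriter -> FileIO
```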

Summary

None of this is certain, but my own guess at the overall flow is roughly as follows. It is pure speculation on my part, so do not trust it too much.

  write('fds') ---->  memory (os buffer) ----> device buffer ---> device
  write('fds') ---->  memory (Python buffered object, e.g. BufferedWriter) ----> memory (os buffer) ----> device buffer ---> device

For faster text I/O, use StringIO.
Buffered I/O is generally the better choice.
Binary buffered objects are thread-safe.

Writing the contents of a StringIO to a file

python - What is the best way to write the contents of a StringIO to a file? - Stack Overflow

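Copy it with shutil:

import shutil

# buf is an existing io.StringIO that holds the text to save
with open('file.xml', 'w') as fd:
    buf.seek(0)  # rewind so the copy starts from the beginning
    shutil.copyfileobj(buf, fd)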

References

io — Core tools for working with streams — Python 3.10.7 documentation

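[Computing] Apple Maps data providers

Apple Maps data sources / where the map data comes from / mapping data / map data / Apple Maps info

Apple Maps data providers

Apple Maps was introduced in iOS 6. Its data comes from:

TomTom
OpenStreetMap
3D map rendering: C3 (a company Apple acquired in fall 2011)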

Reference

Apple using TomTom and OpenStreetMap data in iOS 6 Maps app - The Verge, 2012-06-11

[Computing] How to use KMS

windows kms / key management service /

How to use KMS

KMS requires the following two things:

KMS host
KMS client key (official name: Microsoft Generic Volume License Key, GVLK)

Generic Volume License Keys (GVLK), Windows 11 and Windows 10 (Semi-Annual Channel versions) | Microsoft Docs

slmgr

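slmgr /skms <host name:port> : set the KMS host
slmgr /ato : request activation
slmgr /xpr : check the activation status and expiration date
slmgr.vbs /dti : show the installation ID
slmgr.vbs /dlv : show detailed information about the currently installed license

Installing a product key on the local PC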

slmgr /ipk W269N-WFGWX-YVC9B-4J6C9-T83GX

Setting the KMS host

slmgr /skms <kms host dns or ip>

Activation

slmgr /ato

Reference

[Computing] How to change the initial line length in Reactor Netty

DEFAULT_MAX_INITIAL_LINE_LENGTH

How to change the initial line length in Reactor Netty

curl "http://localhost:8080/my-param-34239892852....?a=fjkdjsljgkl"

I sent a request with an overly long URI like the one above to a WAS (web application server) that uses Reactor Netty.

It failed with the following error:

io.netty.handler.codec.http.TooLongHttpLineException: An HTTP line is larger than 4096 bytes.
  ...

The cause is Netty's default limit, DEFAULT_MAX_INITIAL_LINE_LENGTH (4096 bytes, as the error message shows).

Fix

Raise the limit by calling maxInitialLineLength(), as in the code below.

import reactor.core.publisher.Mono;
import reactor.netty.DisposableServer;
import reactor.netty.http.server.HttpServer;

public class Application {

    public static void main(String[] args) {
        DisposableServer server =
                HttpServer.create()
                          .httpRequestDecoder(spec -> spec.maxInitialLineLength(16384)) 
                          .handle((request, response) -> response.sendString(Mono.just("hello")))
                          .bindNow();

        server.onDispose()
              .block();
    }
}

References

5.5.3. HTTP Request Decoder | Reactor Netty Reference Guide

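[Computing] How to use back references in Kotlin

special replacement pattern / regex / regular expressions / Kotlin / how to use regex / how to use regular expressions

How to use back references in Kotlin

The following code prepends '\\' (two backslashes) to every parenthesis () and square bracket [] in the search keyword. The first version builds the replacement in a lambda; the second uses the $1 back reference in a replacement string, where each literal backslash has to be escaped once more.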

val searchKeyword = "(mytest) [1234]hello"
val regex = Regex("([\\[\\]\\(\\)])")
val restr1 = regex.replace(searchKeyword){
    "\\\\${it.groupValues[1]}"
}
val restr2 = regex.replace(searchKeyword, "\\\\\\\\$1")

Reference

regex - How to use back references in kotlin regular expressions? - Stack Overflow

[Computing] Using PyMongo

python mongodb / mongo / python mongo

Using PyMongo

Installation

pip install pymongo

Usage example

import datetime
import pprint

from bson.son import SON
from pymongo import MongoClient

uri = "mongodb://user:password@example.com:27017/default_db?authSource=admin"
client = MongoClient(uri)
db = client.mydatabase

# Count documents per tag within a one-day window, most frequent first.
pipeline = [
    {"$match": {"time": {
        "$gte": datetime.datetime(2022, 8, 6),
        "$lt": datetime.datetime(2022, 8, 7),
    }}},
    {"$unwind": "$tags"},
    {"$group": {"_id": "$tags", "count": {"$sum": 1}}},
    {"$sort": SON([("count", -1), ("_id", -1)])},
]

# Output: materialize the results as a list
pprint.pprint(list(db.testcoll.aggregate(pipeline)))

# Output: iterate over the cursor
cursor = db.testcoll.aggregate(pipeline)
for doc in cursor:
    print(doc)

Miscellaneous

Frequently Asked Questions — PyMongo 4.2.0 documentation
- MongoClient is thread-safe.
- MongoClient has a built-in connection pool. The default maximum pool size is 100, and it can be changed with the maxPoolSize parameter.
