Friday, 29 May 2020

Converting a PDF file to wiki format

sudo apt install pandoc
sudo apt-get install poppler-utils

PDF --> HTML
mkdir output
pdftohtml -s -p -fmt png -nodrm "file.pdf" "output/file.html"

You can type pdftohtml -h to get a better understanding of the available parameters.
I've explained the parameters used here to make the command easier to understand:
  • -s puts all of the output in a single HTML document (excluding the outline).
  • -p attempts to replace internal PDF links with HTML links.
  • -fmt controls the output format of images, with png and jpg being valid options.
  • -nodrm ignores digital rights management (DRM) restrictions on the PDF.
  • -i ignores images. I didn't use this, but it felt prudent to mention, as in some cases it may massively speed up the conversion.

Alternative method: Poppler pdftotext

pdftotext -htmlmeta "file.pdf" "file.html"

 Replace "file.pdf" with the name of the PDF you want to parse and "file.html" with the name of the HTML file you want to write the text output to.
 The `-htmlmeta` option creates a simple HTML version of the text in your PDF. (This is much less fancy than the previous command and only puts the text in `pre` tags.) You should then see an HTML file in your directory, which you can open to check the results. Depending on the formatting of your source PDF, you may find that Poppler varies in its effectiveness. You can run `pdftotext -h` for information on other options that may improve or worsen your results.

Pandoc: HTML --> MediaWiki

 pandoc file.html -f html -t mediawiki -s -o file.txt
  • -f The input format.
  • -t The output format.
  • -s Standalone adds a header and footer to the document, rather than producing a document fragment.
  • -o The name of the output file.
See the Pandoc user guide for the full list of options.
It is possible you may run into an error with Pandoc, presumably caused by your file being too large. I ran into this error and some fixes can be found here.
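
The two conversion steps above can be chained in a small script. The following is only a sketch, assuming that pdftohtml and pandoc are installed and on the PATH; the function names `build_commands` and `convert` are my own. The name of the HTML file that pdftohtml actually writes may differ between Poppler versions, so check the output directory before relying on the second step.

```python
import subprocess
from pathlib import Path


def build_commands(pdf, html, wiki):
    """Return the pdftohtml and pandoc command lines used in this post."""
    return [
        # PDF --> HTML, same flags as above
        ["pdftohtml", "-s", "-p", "-fmt", "png", "-nodrm", pdf, html],
        # HTML --> MediaWiki, same flags as above
        ["pandoc", html, "-f", "html", "-t", "mediawiki", "-s", "-o", wiki],
    ]


def convert(pdf, html, wiki):
    """Run both steps in order and stop at the first failing command."""
    Path(html).parent.mkdir(parents=True, exist_ok=True)
    for cmd in build_commands(pdf, html, wiki):
        subprocess.run(cmd, check=True)


if __name__ == "__main__":
    # Print the commands instead of running them, as a dry run.
    for cmd in build_commands("file.pdf", "file.html", "file.txt"):
        print(" ".join(cmd))
```

Calling `convert("file.pdf", "file.html", "file.txt")` runs the real tools; the dry run only prints what would be executed.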

Optional: cleaning up bad encoding

Depending on your PDF's encoding, you may find strange Unicode characters in your HTML output. This step is intended to clean up that output as accurately as possible. ftfy stands for "fixes text for you"; it is a Python library with a command-line interface. We'll be using the command line to clean our files. This step is performed before using Pandoc.

Installing ftfy:
git clone https://github.com/LuminosoInsight/python-ftfy.git
cd python-ftfy
sudo python setup.py install
Or, if your system has pip, run pip install ftfy. Note that version 5.0 (the most recent available at the time of writing) and later require Python 3. I used Python 2.x with ftfy 4.1.1 for this answer. In the directory that contains your HTML file, type the following command:
 ftfy -o file_clean.html --preserve-entities file.html
Optionally, you may include the --guess option to have ftfy guess your encoding, or --encoding if you know your encoding. This may produce better results.
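
To show the kind of damage this step is about, here is a tiny standard-library sketch. It is not ftfy's algorithm, only an illustration of one common case: UTF-8 text that was wrongly decoded as Latin-1. The function name `fix_mojibake` is my own.

```python
def fix_mojibake(text):
    """Undo UTF-8 text that was mistakenly decoded as Latin-1.

    If the text does not look like that kind of damage, return it unchanged.
    """
    try:
        return text.encode("latin-1").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        return text


if __name__ == "__main__":
    # Simulate the damage: encode correctly, then decode with the wrong codec.
    broken = "café".encode("utf-8").decode("latin-1")
    print(repr(broken), "->", repr(fix_mojibake(broken)))
```

ftfy handles many more cases (mixed encodings, HTML entities, curly quotes), so this sketch is no replacement for it.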

Thursday, 28 May 2020

Resolving the 502 Bad Gateway Error in Nginx, Ubuntu 16.04 - 20.04 upgrade and PHP5 - PHP7 upgrade

PHP 7.2 Ubuntu 20.04 502 Bad gateway Error message

Set the PHP-FPM socket path correctly in the site configuration and in the files under /etc/nginx/snippets:
sudo nano /etc/nginx/sites-available/default
sudo nano /etc/nginx/snippets

BAD: fastcgi_pass unix:/var/run/php/php7.0-fpm.sock;
GOOD: fastcgi_pass unix:/var/run/php/php7.2-fpm.sock;
---


sudo nano /etc/php/7.2/fpm/pool.d/www.conf

change
listen = 127.0.0.1:9000
to
listen = /var/run/php/php7.2-fpm.sock


sudo apt-get -y install php7.2 php7.2-mysql php7.2-fpm php-fpm

sudo chown :www-data /var/run/php/php7.2-fpm.sock

sudo apt install php-mysql

Reading package lists... Done
Building dependency tree     
Reading state information... Done
The following additional packages will be installed:
  php7.2-mysql
The following NEW packages will be installed:
  php-mysql php7.2-mysql
0 upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 125 kB of archives.
After this operation, 432 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://us.archive.ubuntu.com/ubuntu bionic-updates/main i386 php7.2-mysql i386 7.2.24-0ubuntu0.18.04.6 [123 kB]
Get:2 http://us.archive.ubuntu.com/ubuntu bionic/main i386 php-mysql all 1:7.2+60ubuntu1 [2,004 B]
Fetched 125 kB in 0s (266 kB/s)     
Selecting previously unselected package php7.2-mysql.
(Reading database ... 86153 files and directories currently installed.)
Preparing to unpack .../php7.2-mysql_7.2.24-0ubuntu0.18.04.6_i386.deb ...
Unpacking php7.2-mysql (7.2.24-0ubuntu0.18.04.6) ...
Selecting previously unselected package php-mysql.
Preparing to unpack .../php-mysql_1%3a7.2+60ubuntu1_all.deb ...
Unpacking php-mysql (1:7.2+60ubuntu1) ...
Setting up php7.2-mysql (7.2.24-0ubuntu0.18.04.6) ...

Creating config file /etc/php/7.2/mods-available/mysqlnd.ini with new version

Creating config file /etc/php/7.2/mods-available/mysqli.ini with new version

Creating config file /etc/php/7.2/mods-available/pdo_mysql.ini with new version
Setting up php-mysql (1:7.2+60ubuntu1) ...
Processing triggers for libapache2-mod-php7.2 (7.2.24-0ubuntu0.18.04.6) ...
Processing triggers for php7.2-fpm (7.2.24-0ubuntu0.18.04.6) ...
NOTICE: Not enabling PHP 7.2 FPM by default.
NOTICE: To enable PHP 7.2 FPM in Apache2 do:
NOTICE: a2enmod proxy_fcgi setenvif
NOTICE: a2enconf php7.2-fpm
NOTICE: You are seeing this message because you have apache2 package installed.



sudo service php7.2-fpm restart
sudo service php-fpm restart
sudo service nginx restart


Check /var/log/nginx/error.log if something is still not working.
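
The mismatch between the socket named in fastcgi_pass and the socket PHP-FPM really listens on can also be checked with a few lines of Python. This is a rough sketch under my own assumptions: the function names are hypothetical, and it only looks at uncommented `fastcgi_pass unix:...;` lines in the text you give it.

```python
import re
from pathlib import Path

# Matches active (not commented-out) lines such as:
#     fastcgi_pass unix:/var/run/php/php7.2-fpm.sock;
FASTCGI_RE = re.compile(r"^[ \t]*fastcgi_pass[ \t]+unix:([^;\s]+)[ \t]*;", re.MULTILINE)


def fastcgi_sockets(config_text):
    """Return every Unix socket path referenced by an active fastcgi_pass line."""
    return FASTCGI_RE.findall(config_text)


def missing_sockets(config_text):
    """Return the referenced socket paths that do not exist on this machine."""
    return [path for path in fastcgi_sockets(config_text) if not Path(path).exists()]


if __name__ == "__main__":
    site = Path("/etc/nginx/sites-available/default")
    if site.exists():
        for path in missing_sockets(site.read_text()):
            print("fastcgi_pass points to a missing socket:", path)
```

A socket reported as missing here is a likely cause of the 502 error, because Nginx then has nothing to forward the request to.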

Tuesday, 26 May 2020

Syntax errors in translated files - Letter to poeditor.com

I've used POEditor with Google Translate to translate open source software: the LearnPress and Give-WP plugins.

I've paid $8, yet the output is full of syntax errors: special characters like & and $ are messed up, and some important lines starting with # are omitted (see e.g. the LearnPress WP plugin PO file, lines "#, php-format").

Examples from Give-WP translation - full source below:

1. Unwanted character conversion - and worse, bad output due to extra space:
msgid "Next »"
msgstr "Következő & raquo;"

2. Extra spaces added that break variables:
msgid "Edit Donor: %1$s %2$s"
msgstr "Adományozó szerkesztése:% 1 $ s% 2 $ s"

3. Wrong character encoding:
msgid "Before - %s‎10"
msgstr "Előtt -% s &# x200e; 10"

etc...

Original:
https://www.pastefs.com/pid/211223

Translated, but with syntax errors:
https://www.pastefs.com/pid/211222

The Transifex translation portal also does not accept the file you generate as an input file due to the syntax errors.

Can you fix these errors?
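
A side note for anyone who hits the same problem: the broken placeholders from example 2 can be detected automatically before uploading a file anywhere. The sketch below is my own rough check, not a full PO parser; it assumes simple printf-style placeholders such as %s, %d and %1$s, and the function names are hypothetical.

```python
import re

# printf-style placeholders as they appear in PHP/gettext strings:
# %s, %d, %f, %u and the positional form %1$s, %2$d, ...
PLACEHOLDER_RE = re.compile(r"%(?:\d+\$)?[sdfu]")


def placeholders(text):
    """Return the sorted list of placeholders found in a message."""
    return sorted(PLACEHOLDER_RE.findall(text))


def is_broken(msgid, msgstr):
    """True if the translation does not carry the same placeholders as the source."""
    return placeholders(msgid) != placeholders(msgstr)


if __name__ == "__main__":
    msgid = "Edit Donor: %1$s %2$s"
    msgstr = "Adományozó szerkesztése:% 1 $ s% 2 $ s"
    print(is_broken(msgid, msgstr))
```

Running the same comparison over every msgid/msgstr pair of a PO file lists the entries that need to be repaired by hand.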

Sunday, 22 December 2019

OnePlus 3 factory reset

OP3 factory reset:
https://www.androidsage.com/2017/03/22/restore-to-stock-oneplus-3-3t-with-latest-oxygen-os-firmware-complete-unroot/

OP3 Recovery Image:
http://oxygenos.oneplus.net.s3.amazonaws.com/recovery_op3.img

Fastboot Ubuntu:
https://blog.droidzone.in/2017/10/16/install-adb-and-fastboot-for-oneplus-3/

Custom TWRP Recovery:
https://twrp.me/oneplus/oneplusthree.html

Friday, 25 January 2019

Send mass emails from Excel

Create four columns:
Name Email Link Status

You can then use this VBA macro to send the content of the Link column to the e-mail address in each row.

Sub test2()

    Dim OutApp As Object
    Dim OutMail As Object
    Dim ws As Worksheet
    Dim lastRow As Long
    Dim r As Long

    Application.ScreenUpdating = False
    Set OutApp = CreateObject("Outlook.Application")
    Set ws = Worksheets("Sheet1")

    'Only walk the rows that contain data instead of the whole column B
    lastRow = ws.Cells(ws.Rows.Count, "B").End(xlUp).Row

    For r = 1 To lastRow
        'Rough sanity check for an e-mail address; the header row fails it and is skipped
        If ws.Cells(r, "B").Value Like "?*@?*.?*" Then
            Set OutMail = OutApp.CreateItem(0)      '0 = mail item
            With OutMail
                .To = ws.Cells(r, "B").Value
                .Subject = "Your SUBJECT"
                .Body = "Hi " & ws.Cells(r, "A").Value & "," & vbCrLf & vbCrLf & _
                        "YOUR_MESSAGE" & vbCrLf & " Your  result: " & vbCrLf & _
                        ws.Cells(r, "C").Value & vbCrLf & vbCrLf & "Best regards,"
                .Display                            'opens the draft for review; use .Send to send it directly
            End With
            ws.Cells(r, "D").Value = "sent"
            Set OutMail = Nothing
        End If
    Next r

    Set OutApp = Nothing
    Application.ScreenUpdating = True

End Sub
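
To see in advance which rows the Like "?*@?*.?*" filter lets through, the same wildcard pattern can be tried outside Excel. This is only a sketch: it assumes that Python's fnmatch wildcards (? for one character, * for any run of characters) behave like the two VBA wildcards used here, and `rows_to_mail` is a name I made up.

```python
from fnmatch import fnmatchcase

# Same wildcard pattern as in the macro: something@something.something
LIKE_PATTERN = "?*@?*.?*"


def rows_to_mail(rows):
    """Keep the (name, email, link, status) rows whose e-mail passes the pattern."""
    return [row for row in rows if fnmatchcase(row[1], LIKE_PATTERN)]


if __name__ == "__main__":
    rows = [
        ("Name", "Email", "Link", "Status"),          # header row
        ("Ann", "ann@example.com", "http://example.com/a", ""),
        ("Bob", "bob-at-example.com", "http://example.com/b", ""),
    ]
    for name, email, link, status in rows_to_mail(rows):
        print(name, email)
```

The pattern is deliberately loose; it only rejects obviously malformed cells, not every invalid address.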

Tuesday, 22 January 2019

Generate OpenSSL key for webapp

openssl req -x509 -out ssl.crt -keyout ssl.key \
  -newkey rsa:2048 -nodes -sha256 \
  -subj '/CN=localhost' -extensions EXT -config <( \
   printf "[dn]\nCN=localhost\n[req]\ndistinguished_name = dn\n[EXT]\nsubjectAltName=DNS:localhost\nkeyUsage=digitalSignature\nextendedKeyUsage=serverAuth")
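
The printf part of the command is hard to read on one line. The sketch below only rebuilds the same configuration text so that its sections are visible; the function name `openssl_config` and the hostname parameter are my own additions, the content is what the command above passes to -config.

```python
def openssl_config(hostname="localhost"):
    """Return the inline OpenSSL config that the printf call above generates."""
    lines = [
        "[dn]",
        "CN=" + hostname,
        "[req]",
        "distinguished_name = dn",
        "[EXT]",                                # selected with -extensions EXT
        "subjectAltName=DNS:" + hostname,       # browsers check this, not only the CN
        "keyUsage=digitalSignature",
        "extendedKeyUsage=serverAuth",
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    print(openssl_config())
```

Writing this text to a file and passing the file name to -config should be equivalent to the process substitution, which is handy in shells that do not support <( ... ).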

Friday, 17 August 2018

Beyond browsers: HTTP/2 and WebSockets

Status: Draft

Architectural considerations for choosing a communication protocol

The purpose of this article is to give a high-level overview, with some practical insights, of the current state of the communication protocols frequently used across internet-enabled devices.

HTTP/2 is an IETF Proposed Standard (RFC 7540) that was published in May 2015. It is the successor of the widely used HTTP/1.1, which was published in 1997. HTTP/2 is still a relatively new piece of technology: according to caniuse.com, about 76% of internet-connected clients can take advantage of its presence on a web server.

One could ask: why do we need a new core protocol for the internet when the previous one served us well for almost two decades? The answer lies in the fact that the internet has changed from a niche technology into the basic fabric of our societies. In 1997, there were about 120 million internet users. Today, this number is more than 3.4 billion. Web pages have become increasingly complex, with popular sites often creating as many as forty TCP connections for a single client (check netstat | grep http). This and many other workarounds, such as embedding CSS or JS in HTML documents to avoid extra file transfers, show that delivering a rich web experience needs a more fundamental redesign, in this case at the protocol level.

HTTP/2 provides a set of well-thought-out improvements over its predecessor. Sharing a TCP connection between resources becomes possible without one blocking another. This is achieved by multiplexing, which splits the data into small chunks and interleaves them with each other. Header compression reduces the HTTP overhead while preserving the stateless nature of the protocol. Changing from a text-based to a binary wire format mandates the use of special tools (e.g. Wireshark 2.0 or newer), but it also makes processing faster and more straightforward.
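
The idea of multiplexing can be illustrated with a toy model. The sketch below is not an HTTP/2 implementation: it simply cuts each resource into chunks tagged with a stream id and interleaves them round-robin, which is my simplifying assumption (real HTTP/2 also applies prioritization and flow control). All names are hypothetical.

```python
from itertools import zip_longest


def split_into_frames(stream_id, payload, frame_size):
    """Cut one resource into (stream_id, chunk) pairs of at most frame_size bytes."""
    return [(stream_id, payload[i:i + frame_size])
            for i in range(0, len(payload), frame_size)]


def multiplex(resources, frame_size=4):
    """Interleave the frames of several streams on one simulated connection."""
    per_stream = [split_into_frames(sid, data, frame_size)
                  for sid, data in resources.items()]
    wire = []
    for round_of_frames in zip_longest(*per_stream):
        wire.extend(frame for frame in round_of_frames if frame is not None)
    return wire


def demultiplex(wire):
    """Reassemble every stream from the interleaved frames."""
    streams = {}
    for stream_id, chunk in wire:
        streams[stream_id] = streams.get(stream_id, b"") + chunk
    return streams


if __name__ == "__main__":
    resources = {1: b"<html>...</html>", 3: b"body{}"}
    wire = multiplex(resources)
    print(wire)
    print(demultiplex(wire) == resources)
```

Because every chunk carries its stream id, a small resource does not have to wait until a large one has been transferred completely.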


                                                                    HTTP/1.1      HTTP/2
Format                                                              Text-based    Binary
Multiplexing (a single TCP connection for multiple data streams)    No            Yes
Header compression                                                  No            Yes
Server push                                                         No            Yes
Prioritization                                                      No            Yes

                      WebSockets    HTTP/2
Format                Binary        Binary
Multiplexing          No            Yes
Header compression    N/A           Yes

                      gRPC      Socket.io
Format                Binary    Binary (WebSocket)
Multiplexing
Header compression

If the goal is to receive data over a connection that is kept open, it may be that we should use WebSocket instead of HTTP.

But if you insist on HTTP, the implementation depends on how the client side wants to handle the incoming data. The most reliable solution is probably Server-Sent Events. For that you could use this Node.js library: https://www.npmjs.com/package/sse, although it may not work with an HTTP/2 server (in that case, if you send a bug report, you can count on the problem being fixed, but I cannot guarantee a deadline :)).
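
For reference, the wire format of Server-Sent Events is plain text, which is part of why it is easy to handle on the client. The sketch below formats a single event; the function name and its parameters are my own, and a real server would also have to send the response with the Content-Type text/event-stream and keep the connection open.

```python
def format_sse(data, event=None, event_id=None):
    """Format one Server-Sent Events message as text.

    Every field goes on its own line and a blank line ends the message.
    """
    lines = []
    if event_id is not None:
        lines.append("id: " + str(event_id))
    if event is not None:
        lines.append("event: " + event)
    # A multi-line payload needs one "data:" field per line.
    lines.extend("data: " + line for line in data.split("\n"))
    return "\n".join(lines) + "\n\n"


if __name__ == "__main__":
    print(format_sse("temperature=21.5", event="reading", event_id=1), end="")
```

On the browser side such a stream is consumed with the EventSource API, which also reconnects automatically after a dropped connection.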

If the data is processed not by client-side JavaScript code but by the browser itself (e.g. video streaming), you can use the standard Node.js stream classes to send the data to the client in several chunks; the HTTP response object is a descendant of this class. Receiving such a stream from JavaScript is not fully reliable, because the packet boundaries can shift, so I do not recommend it if you want to process the incoming data in JavaScript.

But there are higher-level abstractions as well, for example the socket.io library, which hides the implementation details and provides bidirectional, packet-based communication between the client and the server. This is probably the easiest solution if you do not care about the transmission medium and only want to receive the data simply and reliably.

Actually, all of these are independent of HTTP/2. Only HTTP/2 server push could be used for something similar, but it is not yet exposed through a JavaScript API, so I think that option is out.

---


Now let's abstract away from browser-specific details.


HTTP/2 can also be a good choice when the communication is not between a browser and a server. gRPC, Google's RPC framework, builds on HTTP/2, for instance.


If the low-level HTTP/2 streams are accessible (not just through the browser's JavaScript APIs), the protocol offers much more flexibility. node-http2 gives this kind of low-level access through its http2.protocol API, which is documented in broad terms here: https://github.com/molnarg/node-http2/blob/master/lib/protocol/index.js

The low-level streams can be handled through the Node.js Stream API (with cork / uncork you can control when the buffered data is flushed), which I think has no counterpart in the browser's JavaScript API :). Building directly on a TCP-style byte stream is not recommended if you want to implement packet-based communication, because the packet boundaries can shift on the client side due to buffering. Instead, HTTP/2 has a framing layer, which can be used for packet-based communication when you have low-level access to the HTTP/2 implementation (such as the http2.protocol API).
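
Why a framing layer preserves packet boundaries can be shown with the 9-byte frame header defined in RFC 7540: a 24-bit payload length, an 8-bit type, 8-bit flags and a 31-bit stream identifier. The sketch below only packs and unpacks this header; it is not a usable HTTP/2 stack, and the function names are my own.

```python
import struct

FRAME_HEADER_LEN = 9
TYPE_DATA = 0x0          # DATA frame type
FLAG_END_STREAM = 0x1    # END_STREAM flag of a DATA frame


def pack_frame(frame_type, flags, stream_id, payload):
    """Prefix the payload with a 9-byte HTTP/2 style frame header."""
    length = len(payload)
    if length >= 1 << 24:
        raise ValueError("payload does not fit into a 24-bit length field")
    header = struct.pack(">BHBBI", length >> 16, length & 0xFFFF,
                         frame_type, flags, stream_id & 0x7FFFFFFF)
    return header + payload


def unpack_frame(buf):
    """Return (type, flags, stream_id, payload, rest), or None if buf is incomplete."""
    if len(buf) < FRAME_HEADER_LEN:
        return None
    high, low, frame_type, flags, stream_id = struct.unpack(">BHBBI", buf[:FRAME_HEADER_LEN])
    length = (high << 16) | low
    end = FRAME_HEADER_LEN + length
    if len(buf) < end:
        return None
    return frame_type, flags, stream_id & 0x7FFFFFFF, buf[FRAME_HEADER_LEN:end], buf[end:]


if __name__ == "__main__":
    wire = pack_frame(TYPE_DATA, 0, 1, b"hello") + pack_frame(TYPE_DATA, FLAG_END_STREAM, 1, b"world!")
    first = unpack_frame(wire)
    second = unpack_frame(first[4])
    print(first[3], second[3])
```

However the bytes are buffered or split in transit, the receiver can wait until a whole frame has arrived and then hand over exactly the payload that the sender wrote.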

The HTTP/2 push API can also be interesting if you are looking at a non-browser client. The only oddity is that a push stream must always belong to an already existing, non-push stream. On the web this makes sense: the client downloads, say, the HTML file, and the server pushes the corresponding CSS, JS and other files. Outside a web context it is not always clear which stream a push stream belongs to, and in many contexts declaring it makes no sense at all.

If you want to use the HTTP/2 framing layer, push, or other low-level HTTP/2 features in a proof of concept, it is a good question which HTTP/2 library you should use. I suspect that none of them has particularly good documentation or APIs for this sort of thing.

Selecting the most suitable communication protocol for industrial devices or the Internet of Things (IoT) is a challenging task. Changing an existing architecture is even more so: a migration plan is required to ensure a good fit with existing deployments and network characteristics. Other protocols, such as CoAP or MQTT, are also popular in the IoT world, but HTTP/2 can be satisfactory in many use cases where the devices are less constrained than those described in RFC 7228 (about 10 KiB of RAM and 100 KiB of code space). This could be especially true when it is used with a battle-hardened framework like gRPC, whose version 1.0 was released on 23 August 2016.

A small remark: we can remember Gmail being labelled “Beta” even when it far surpassed any other mail provider in quality, so version 1.0 at Google means something different than at other large software houses.

gRPC perfectly fits other criteria as well: it is based on standards, it’s open-source and even the wire-level data format is well-documented – therefore vendor lock-in can be easily avoided.

Before we go further, let’s see what other architectural approaches we could consider.

Should a communication middleware be used?

Middleware solutions provide loose coupling and extra features to deal with communication tasks in a complex enterprise environment. However, these extra features come with added installation and management complexity that not every use case justifies. So, as always, the pros and cons should be analysed carefully.

Robust communication

Middleware solutions provide services like guaranteed delivery that are useful if a client or network error occurs.

Deploying middleware

Firewall ports

Firewall management is a cumbersome process in corporate environments, and for good reason: firewall policies are fundamental to IT security. In addition to providing a business rationale, usually more than one manager needs to authorize such a change. Therefore it often takes days or even weeks to get a change through.

Reasons for choosing HTTP-REST as a base protocol

  • Ports 80 and 443 are open almost everywhere

  • Lo-REST (using only GET and POST requests) can pass almost all firewalls, even where other HTTP methods may not get through

  • Middleware solutions need a dedicated port (AMQP: 5672, MQTT: 1883/8883), and getting it opened in a corporate IT environment can be cumbersome

Choosing HTTP-REST as a base protocol

Developers know it, it is easy to start with, and it passes through firewalls.

Challenges

TCP is bidirectional, but how can that be used in practice with request-response semantics?

Low-level approach – streams and frames

  • Browser focus

  • API documentation

gRPC

  • Features

  • Characteristics of a truly open and mature communication protocol